Similar literature
1.
Various models have been used for genomic prediction. Bayesian variable selection models often predict more accurate genomic breeding values than genomic BLUP (GBLUP), but GBLUP is generally preferred for routine genomic evaluations because of low computational demand. The objective of this study was to achieve the benefits of both models using results from Bayesian models and genome-wide association studies as weights on single nucleotide polymorphism (SNP) markers when constructing the genomic matrix (G-matrix) for genomic prediction. The data comprised 5,221 progeny-tested bulls from the Nordic Holstein population. The animals were genotyped using the Illumina Bovine SNP50 BeadChip (Illumina Inc., San Diego, CA). Weighting factors in this investigation were the posterior SNP variance, the square of the posterior SNP effect, and the corresponding minus base-10 logarithm of the marker association P-value [−log10(P)] of a t-test obtained from the analysis using a Bayesian mixture model with 4 normal distributions, the square of the estimated SNP effect, and the corresponding −log10(P) of a t-test obtained from the analysis using a classical genome-wide association study model (linear regression model). The weights were derived from the analysis based on data sets that were 0, 1, 3, or 5 yr before performing genomic prediction. In building a G-matrix, the weights were assigned either to each marker (single-marker weighting) or to each group of approximately 5 to 150 markers (group-marker weighting). The analysis was carried out for milk yield, fat yield, protein yield, fertility, and mastitis. Deregressed proofs (DRP) were used as response variables to predict genomic estimated breeding values (GEBV). Averaging over the 5 traits, the Bayesian model led to 2.0% higher reliability of GEBV than the GBLUP model with an original unweighted G-matrix. 
The superiority of using a GBLUP with weighted G-matrix over GBLUP with an original unweighted G-matrix was the largest when using a weighting factor of posterior variance, resulting in 1.7 percentage points higher reliability. The second best weighting factors were −log10 (P-value) of a t-test corresponding to the square of the posterior SNP effect from the Bayesian model and −log10 (P-value) of a t-test corresponding to the square of the estimated SNP effect from the linear regression model, followed by the square of estimated SNP effect and the square of the posterior SNP effect. In addition, group-marker weighting performed better than single-marker weighting in terms of reducing bias of GEBV, and also slightly increased prediction reliability. The differences between weighting factors and scenarios were larger in prediction bias than in prediction accuracy. Finally, weights derived from a data set having a lag up to 3 yr did not reduce reliability of GEBV. The results indicate that posterior SNP variance estimated from a Bayesian mixture model is a good alternative weighting factor, and that assigning common weights to groups of 30 markers is a good strategy when using markers of the 50,000-marker (50K) chip. In a population with gradually increasing reference data, the weights can be updated once every 3 yr.
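As a rough illustration of the weighting idea in this abstract, the sketch below builds a weighted G-matrix as Z D Z' divided by the sum of 2p(1 − p), where D is a diagonal matrix of marker weights, and optionally shares one weight across each group of adjacent markers. The abstract does not give the exact formula, so the scaling, the rescaling of weights to average 1, and the function names `group_weights` and `weighted_g_matrix` are assumptions made for illustration only.

```python
import numpy as np

def group_weights(w, group_size=30):
    """Give every marker the mean weight of its group of adjacent markers."""
    w = np.asarray(w, dtype=float)
    out = np.empty_like(w)
    for start in range(0, w.size, group_size):
        out[start:start + group_size] = w[start:start + group_size].mean()
    return out

def weighted_g_matrix(M, w):
    """Weighted genomic relationship matrix (assumed form: Z D Z' / sum(2pq)).

    M: (n_animals, n_markers) genotypes coded 0/1/2.
    w: (n_markers,) non-negative marker weights, e.g. posterior SNP variances.
    """
    M = np.asarray(M, dtype=float)
    p = M.mean(axis=0) / 2.0            # observed allele frequencies
    Z = M - 2.0 * p                     # centered genotypes
    w = np.asarray(w, dtype=float)
    w = w * w.size / w.sum()            # assumption: rescale weights to average 1
    scale = np.sum(2.0 * p * (1.0 - p))
    return (Z * w) @ Z.T / scale        # Z D Z' without forming D explicitly
```

With all weights equal, this reduces to the usual unweighted G-matrix, which makes it easy to compare single-marker and group-marker weighting on the same genotypes.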

2.
Test-day traits are important for genetic evaluation in dairy cattle and are better modeled by multiple-trait random regression models (RRM). The reliability and bias of genomic estimated breeding values (GEBV) predicted using multiple-trait RRM via single-step genomic best linear unbiased prediction (ssGBLUP) were investigated in the 3 major dairy cattle breeds in Canada (i.e., Ayrshire, Holstein, and Jersey). Individual additive genomic random regression coefficients for the test-day traits were predicted using 2 multiple-trait RRM: (1) one for milk, fat, and protein yields in the first, second, and third lactations, and (2) one for somatic cell score in the first, second, and third lactations. The predicted coefficients were used to derive GEBV for each lactation day and, subsequently, the daily GEBV were compared with traditional daily parent averages obtained by BLUP. To ensure compatibility between pedigree and genomic information for genotyped animals, different scaling factors for combining the inverse of genomic (G⁻¹) and pedigree (A₂₂⁻¹) relationship matrices were tested. In addition, the inclusion of only genotypes from animals with accurate breeding values (defined in preliminary analysis) was compared with the inclusion of all available genotypes in the analyses. The ssGBLUP model led to considerably larger validation reliabilities than the BLUP model without genomic information. In general, scaling factors used to combine the G⁻¹ and A₂₂⁻¹ matrices had small influence on the validation reliabilities. However, a greater effect was observed in the inflation of GEBV. Less inflated GEBV were obtained by the ssGBLUP compared with the parent average from traditional BLUP when using optimal scaling factors to combine the G⁻¹ and A₂₂⁻¹ matrices. Similar results were observed when including either all available genotypes or only genotypes from animals with accurate breeding values. 
These findings indicate that ssGBLUP using multiple-trait RRM increases reliability and reduces bias of breeding values of young animals when compared with parent average from traditional BLUP in the Canadian Ayrshire, Holstein, and Jersey breeds.
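The scaling factors mentioned in this abstract can be pictured with the commonly used form of the ssGBLUP inverse, in which the block for genotyped animals in the inverse pedigree matrix is augmented by τG⁻¹ − ωA₂₂⁻¹. The abstract does not state which factors were tested, so `tau`, `omega`, and the function name `h_inverse` below are illustrative assumptions, and dense inversion is used only because the example is small.

```python
import numpy as np

def h_inverse(A_inv, G, A22, genotyped, tau=1.0, omega=1.0):
    """Sketch of H^-1 = A^-1 + [[0, 0], [0, tau*G^-1 - omega*A22^-1]].

    A_inv:     inverse pedigree relationship matrix for all animals.
    G, A22:    genomic and pedigree relationships among genotyped animals.
    genotyped: positions of the genotyped animals within A_inv.
    tau, omega: scaling factors (assumed names, not taken from the abstract).
    """
    H_inv = np.array(A_inv, dtype=float, copy=True)
    idx = np.asarray(genotyped)
    block = tau * np.linalg.inv(G) - omega * np.linalg.inv(A22)
    H_inv[np.ix_(idx, idx)] += block    # only the genotyped block changes
    return H_inv
```

With tau = omega = 1 and G equal to A₂₂, the function returns the pedigree inverse unchanged, which is a convenient sanity check before trying other scaling factors.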

3.
《Journal of dairy science》2022,105(2):923-939
Single-step genomic BLUP (ssGBLUP) is a method for genomic prediction that integrates matrices of pedigree (A) and genomic (G) relationships into a single unified additive relationship matrix whose inverse is incorporated into a set of mixed model equations (MME) to compute genomic predictions. Pedigree information in dairy cattle is often incomplete. Missing pedigree potentially causes biases and inflation in genomic estimated breeding values (GEBV) obtained with ssGBLUP. Three major issues are associated with missing pedigree in ssGBLUP, namely biased predictions by selection, missing inbreeding in pedigree relationships, and incompatibility between G and A in level and scale. These issues can be solved using a proper model for unknown-parent groups (UPG). The theory behind the use of UPG is well established for pedigree BLUP, but not for ssGBLUP. This study reviews the development of the UPG model in pedigree BLUP, the properties of UPG models in ssGBLUP, and the effect of UPG on genetic trends and genomic predictions. Similarities and differences between UPG and metafounder (MF) models, a generalized UPG model, are also reviewed. A UPG model (QP) derived using a transformation of the MME has a good convergence behavior. However, with insufficient data, the QP model may yield biased genetic trends and may underestimate UPG. The QP model can be altered by removing the genomic relationships linking GEBV and UPG effects from MME. This altered QP model exhibits less bias in genetic trends and less inflation in genomic predictions than the QP model, especially with large data sets. Recently, a new model, which encapsulates the UPG equations into the pedigree relationships for genotyped animals, was proposed in simulated purebred populations. The MF model is a comprehensive solution to the missing pedigree issue. This model can be a choice for multibreed or crossbred evaluations if the data set allows the estimation of a reasonable relationship matrix for MF. 
Missing pedigree influences genetic trends, but its effect on the predictability of genetic merit for genotyped animals should be negligible when many proven bulls are genotyped. The SNP effects can be back-solved using GEBV from older genotyped animals, and these predicted SNP effects can be used to calculate GEBV for young-genotyped animals with missing parents.
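The back-solving step described at the end of this abstract can be sketched as follows, assuming an unweighted GBLUP-type model in which GEBV equal Z a for centered genotypes Z and SNP effects a. Under that assumption a = Z'(ZZ')⁻¹ GEBV, because the scaling constant of G cancels. The function names are hypothetical, and the sketch assumes ZZ' is invertible (for example, genotypes centered with base-population allele frequencies and more markers than animals).

```python
import numpy as np

def backsolve_snp_effects(Z_ref, gebv_ref):
    """Back-solve SNP effects from GEBV of older genotyped animals.

    Z_ref:    (n_ref, n_markers) centered genotypes of the reference animals.
    gebv_ref: (n_ref,) their GEBV.
    Assumes Z_ref @ Z_ref.T is invertible.
    """
    ZZt = Z_ref @ Z_ref.T
    return Z_ref.T @ np.linalg.solve(ZZt, gebv_ref)

def predict_gebv(Z_young, snp_effects):
    """GEBV of young genotyped animals as genotypes times SNP effects."""
    return Z_young @ snp_effects
```

Applying `predict_gebv` to the reference genotypes themselves should reproduce the input GEBV, which is a quick check that the back-solved effects are consistent.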

4.
《Journal of dairy science》2022,105(6):5221-5237
Approximate multistep methods to calculate reliabilities for estimated breeding values in large genetic evaluations were developed for single-trait (ST-R2A) and multitrait (MT-R2A) single-step genomic BLUP (ssGBLUP) models. First, a traditional animal model was used to estimate the amount of nongenomic information for the genotyped animals. Second, this information was used with genomic data in a genomic BLUP model (genomic BLUP/SNP-BLUP) to approximate the total amount of information and ssGBLUP reliabilities for the genotyped animals. Finally, reliabilities for the nongenotyped animals were calculated using a traditional animal model where the increased information due to genomic data for the genotyped animals is accounted for by including pseudo-record counts for the genotyped animals. The approaches were tested using a multiple-trait ssGBLUP model on 2 data sets. The first data set (data 1) was small enough such that exact ssGBLUP model reliabilities could be computed by inversion and compared with the approximation method reliabilities. Data 1 had 46,535 first-, 35,290 second-, and 23,780 third-lactation 305-d milk yield records from 47,124 Finnish Red dairy cows. The pedigree comprised 64,808 animals, of which 19,757 were genotyped. We examined the efficiency of the MT-R2A approximation on a large data set (data 2) derived from the joint Nordic (Danish, Finnish, and Swedish) Holstein dairy cattle data. Data 2 had 17.8 million 305-d milk records from 8.3 million cows and first 3 lactations. The pedigree had 11 million animals of which 274,145 were genotyped on 46,342 SNP markers. For data 1, correlations between the exact ssGBLUP model and the ST-R2A for the genotyped (nongenotyped) animals were 0.995 (0.987), 0.965 (0.984), and 0.950 (0.983) for first, second, and third lactation, respectively. 
Correspondingly, correlations between exact ssGBLUP reliabilities and MT-R2A for the genotyped (nongenotyped) animals were 0.995 (0.993), 0.992 (0.991), and 0.990 (0.990) for first, second, and third lactation, respectively. The regression coefficients (b1) of ssGBLUP reliability on ST-R2A for the genotyped (nongenotyped) animals ranged from 0.87 (0.94) for first lactation to 0.68 (0.93) for third lactation, whereas for MT-R2A they were between 0.91 (0.99) for first lactation to 0.89 (0.99) for third lactation. Correspondingly, the intercepts varied from 0.11 (0.05) to 0.3 (0.06) for ST-R2A and from 0.06 (0.01) to 0.07 (0.02) for MT-R2A. The computing time for the approximation method was approximately 12% of that required by the direct exact approach. In conclusion, the developed approximate approach allows calculating estimated breeding value reliabilities in the ssGBLUP model even for large data sets.

5.
《Journal of dairy science》2021,104(11):11779-11789
Selection based on genomic predictions has become the method of choice for genetic improvement in dairy cattle. This offers huge opportunity for developing countries with little or no pedigree data, and preliminary studies have shown promising results. The African Dairy Genetic Gains (ADGG) project initiated a digital system of dairy performance data collection, accompanied by genotyping in Tanzania in 2016. Currently, ADGG has the largest body of dairy performance data generated in East Africa from a smallholder dairy system. This study examines the use of genomic best linear unbiased prediction (GBLUP) and single-step (ss)GBLUP for the estimation of genetic parameters and accuracy of genomic prediction for daily milk yield and body weight in Tanzania. The estimates of heritability for daily milk yield from GBLUP and ssGBLUP were essentially the same, at 0.12 ± 0.03. The heritability estimates for daily milk yield averaged over the whole lactation from random regression model (RRM) GBLUP or ssGBLUP were 0.22 and 0.24, respectively. The heritability of body weight from GBLUP was 0.24 ± 04 but was 0.22 ± 04 from the ssGBLUP analysis. Accuracy of genomic prediction for milk yield from a forward validation was 0.57 for GBLUP based on fixed regression model or 0.55 from an RRM. Corresponding estimates from ssGBLUP were 0.59 and 0.53, respectively. Accuracy for body weight, however, was much higher at 0.83 from GBLUP and 0.77 for ssGBLUP. The moderate to high levels of accuracy of genomic prediction (0.53–0.83) obtained for milk yield and body weight indicate that selection on the basis of genomic prediction is feasible in smallholder dairy systems and most probably the only initial possible pathway to implementing sustained genetic improvement programs in such systems.

6.
Methods for genomic prediction were evaluated for an Israeli Holstein dairy population of 713,686 cows and 1,305 progeny-tested bulls with genotypes. Inclusion of genotypes of 343 elite cows in an evaluation method that considers pedigree, phenotypes, and genotypes simultaneously was also evaluated. Two data sets were available: a complete data set with production records from 1985 through 2011, and a reduced data set with records after 2006 deleted. For each production trait, a multitrait animal model was used to compute traditional genetic evaluations for parities 1 through 3 as separate traits. Evaluations were calculated for the reduced and complete data sets. The evaluations from the reduced data set were used to calculate parent average for validation bulls, which was the benchmark for comparing gain in predictive ability from genomics. Genomic predictions for bulls in 2006 were calculated using a Bayesian regression method (BayesC), genomic BLUP (GBLUP), single-step GBLUP (ssGBLUP), and weighted ssGBLUP (WssGBLUP). Predictions using BayesC and GBLUP were calculated either with or without an index that included parent average. Genomic predictions that included elite cow genotypes were calculated using ssGBLUP and WssGBLUP. Predictive ability was assessed by coefficients of determination (R2) and regressions of predictions of 135 validation bulls with no daughters in 2006 on deregressed evaluations of those bulls in 2011. A reduction in R2 and regression coefficients was observed from parities 1 through 3. Fat and protein yields had the lowest R2 for all the methods. On average, R2 was lowest for parent averages, followed by GBLUP, BayesC, ssGBLUP, and WssGBLUP. For some traits, R2 for direct genomic values from BayesC and GBLUP were lower than those for parent averages. 
Genomic estimated breeding values using ssGBLUP were the least biased, and this method appears to be a suitable tool for genomic evaluation of a small genotyped population, as it automatically accounts for parental index, allows for inclusion of female genomic information without preadjustments in evaluations, and uses the same model as in traditional evaluations. Weighted ssGBLUP has the potential for higher evaluation accuracy.

7.
The objective of this study was to compare genetic trends from single-step genomic BLUP (ssGBLUP) and traditional BLUP models for milk production traits of US Holsteins. Phenotypes were 305-d milk, fat, and protein yields from 21,527,040 cows recorded between January 1990 and August 2015. The pedigree file included 29,651,623 animals and was limited to 3 generations back from recorded or genotyped animals. Genotypes for 764,029 animals were used, and analyses were by a 3-trait repeatability model as used in the US official genetic evaluation. Unknown-parent groups were incorporated into the inverse of a relationship matrix (H⁻¹ in ssGBLUP and A⁻¹ in BLUP) with the QP transformation. For ssGBLUP, 18,359 genotyped animals were randomly chosen as core animals to calculate the inverse of the genomic relationship matrix with the APY algorithm. Computations took 6.5 h and 1.4 GB of memory for BLUP, and 13 h and 115 GB of memory for ssGBLUP. For genotyped sires with at least 10 daughters, the average genetic levels for predicted transmitting ability (PTA) and genomic PTA were similar up to 2008, with a higher level for ssGBLUP later (approximately by 36 kg for milk, 2.1 kg for fat, and 1.1 kg for protein for bulls born in 2010). For genotyped cows, the average genetic levels were similar up to 2006, with a higher level for ssGBLUP (approximately by 91 kg for milk, 3.6 kg for fat, and 2.7 kg for protein for cows born in 2012). For all cows, the average levels were slightly higher for ssGBLUP, with much smaller differences than for genotyped cows. Trends for BLUP indicate bias due to genomic preselection for genotyped sires and cows. For official evaluations released in December 2016, traditional PTA had the same trend as multiple-step genomic PTA for both genotyped bulls and cows except for the youngest bulls, who had traditional PTA slightly lower than genomic PTA. 
For genotyped bulls born in recent years, genetic gain for official traditional and genomic evaluations was similar in contrast to ssGBLUP and BLUP differences. Official PTA for cows were adjusted so that the Mendelian sampling variance was comparable with that for bulls, and those adjustments likely removed bias due to genomic preselection from traditional PTA, especially for genotyped cows. The ssGBLUP method seems to account partially for that bias and is computationally suitable for national evaluations.

8.
The observed low accuracy of genomic selection in multibreed and admixed populations results from insufficient linkage disequilibrium between markers and trait loci. Failure to remove variation due to the population structure may also hamper the prediction accuracy. We verified if accounting for breed origin of alleles in the calculation of genomic relationships would improve the prediction accuracy in an admixed population. Individual breed proportions derived from the pedigree were used to estimate breed-wise allele frequencies (AF). Breed-wise and across-breed AF were estimated from the currently genotyped population and also in the base population. Genomic relationship matrices (G) were subsequently calculated using across-breed (GAB) and breed-wise (GBW) AF estimated in the currently genotyped and also in the base population. Unified relationship matrices were derived by combining different G with pedigree relationships in the evaluation of genomic estimated breeding values (GEBV) for genotyped and ungenotyped animals. The validation reliabilities and inflation of GEBV were assessed by a linear regression of deregressed breeding value (deregressed proofs) on GEBV, weighted by the reliability of deregressed proofs. The regression coefficients (b1) from GAB ranged from 0.76 for milk to 0.90 for protein. Corresponding b1 terms from GBW ranged from 0.72 to 0.88. The validation reliabilities across 4 evaluations with different G were generally 36, 40, and 46% for milk, protein, and fat, respectively. Unexpectedly, validation reliabilities were generally similar across different evaluations, irrespective of AF used to compute G. Thus, although accounting for the population structure in GBW tends to simplify the blending of genomic- and pedigree-based relationships, it appeared to have little effect on the validation reliabilities.
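The validation regression described in this abstract (deregressed proofs regressed on GEBV, weighted by the reliability of the deregressed proofs) can be sketched with weighted least squares. The function name `weighted_validation` and the weighted R² returned alongside the intercept and slope are illustrative assumptions; the abstract does not say how the validation reliability itself was derived from the regression.

```python
import numpy as np

def weighted_validation(drp, gebv, weights):
    """Weighted regression of DRP on GEBV: returns (b0, b1, weighted R^2).

    drp:     deregressed proofs of the validation animals.
    gebv:    their genomic estimated breeding values.
    weights: regression weights, e.g. reliabilities of the DRP.
    """
    drp, gebv, w = (np.asarray(x, dtype=float) for x in (drp, gebv, weights))
    X = np.column_stack([np.ones_like(gebv), gebv])   # intercept and slope
    XtW = X.T * w                                     # X' W with diagonal W
    b0, b1 = np.linalg.solve(XtW @ X, XtW @ drp)
    fitted = b0 + b1 * gebv
    mean_w = np.sum(w * drp) / np.sum(w)              # weighted mean of DRP
    ss_res = np.sum(w * (drp - fitted) ** 2)
    ss_tot = np.sum(w * (drp - mean_w) ** 2)
    return b0, b1, 1.0 - ss_res / ss_tot
```

A slope b1 below 1 indicates that GEBV are inflated (more variable than the deregressed proofs support), which is how values such as 0.76 or 0.90 in the abstract are usually read.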

9.
Fatty acid (FA) composition is one of the most important aspects of milk nutritional quality. However, the inclusion of this trait as a breeding goal for dairy species is hampered by the logistics and high costs of phenotype recording. Fourier-transform infrared spectroscopy (FTIR) is a valid and cheap alternative to laboratory gas chromatography (GC) for predicting milk FA composition. Moreover, as for other novel phenotypes, the efficiency of selection for these traits can be enhanced by using genomic data. The objective of this research was to compare traditional versus genomic selection approaches for estimating genetic parameters and breeding values of milk fatty acid composition in dairy sheep using either GC-measured or FTIR-predicted FA as phenotypes. Milk FA profiles were available for a total of 923 Sarda breed ewes. The youngest 100 had their own phenotype masked to mimic selection candidates. Pedigree relationship information and genotypes were available for 923 and 769 ewes, respectively. Three statistical approaches were used: the classical-pedigree-based BLUP, the genomic BLUP that considers the genomic relationship matrix G, and the single-step genomic BLUP (ssGBLUP) where pedigree and genomic relationship matrices are blended into a single H matrix. Heritability estimates using pedigree were lower than ssGBLUP, and very similar between GC and FTIR regardless of the statistical approach used. For some FA, mostly associated with animal diet (i.e., C18:2n-6, C18:3n-3), random effect of combination of flock and test date explained a relevant quota of total variance, reducing the heritability estimates accordingly. Genomic approaches (genomic BLUP and ssGBLUP) outperformed the traditional pedigree method both for GC and FTIR FA. Prediction accuracies in the older cohort were larger than in the young cohort. 
Genomic prediction accuracies (obtained using either G or H relationship matrix) in the young cohort of animals, where their own phenotypes were masked, were similar for GC and FTIR. Multiple-trait analysis slightly affected genomic breeding value accuracies. These results suggest that FTIR-predicted milk FA composition could represent a valid option for inclusion in breeding programs.

10.
《Journal of dairy science》2019,102(8):7237-7247
Relatedness between reference and test animals has an important effect on the reliability of genomic prediction for test animals. Because genomic prediction has been widely applied in practical cattle breeding and bulls have been selected according to genomic breeding value without progeny testing, the sires or grandsires of candidates might not have phenotypic information and might not be in the reference population when the candidates are selected. The objective of this study was to investigate the decreasing trend of the reliability of genomic prediction given distant reference populations, using genomic best linear unbiased prediction (GBLUP) and Bayesian variable selection models with or without including the quantitative trait locus (QTL) markers detected from sequencing data. The data used in this study consisted of 22,242 bulls genotyped using the 54K SNP array from EuroGenomics. Among them, 1,444 Danish bulls born from 2006 to 2010 were selected as test animals. Different reference populations with varying relationships to test animals were created according to pedigree-based relationships. The reference individuals having a relationship with one or more test animals higher than 0.4 (scenario ρ < 0.4), 0.2 (ρ < 0.2), or 0.1 (ρ < 0.1, where ρ = relationship coefficient) were removed from reference sets; these represented the distance between reference and test animals being 2 generations, 3 generations, and 4 generations, respectively. Imputed whole-genome sequencing data of bulls from Denmark were used to conduct a genome-wide association study (GWAS). A small number of significant variants (QTL markers) from the GWAS were added to the array data. To compare the effects of different models, the basic GBLUP model, a Bayesian selection variable model, a GBLUP model with 2 components of genetic effects, and a Bayesian model with pooled array data and QTL markers were used for estimating genomic estimated breeding values (GEBV) of test animals. 
The reliability of genomic prediction decreased when the test animals were more generations away from the reference population. The reliability of genomic prediction was 0.461 for 1 generation away and 0.396 for 3 generations away, with the same number of individuals in the reference set, using a GBLUP model with chip markers only. The results showed that using the Bayesian method and QTL markers improved the reliability of genomic prediction in all scenarios of relationship between test and reference animals, in a range of 1.3% and 65.1% (4 generations away with only 841 individuals in the reference set). However, most gains were for predictions of milk yield and fat yield. There was little improvement for predictions of protein yield and mastitis, and no improvement for prediction of fertility, except for scenario ρ < 0.1, in which there was a large improvement for predictions of all traits. On the other hand, models including more than 10% polygenic effect decreased prediction reliability when the relationship between test and reference animals was distant.

11.
The purpose of this study was to determine whether multi-country genomic evaluation can be accomplished by multiple-trait genomic best linear unbiased predictor (GBLUP) without sharing genotypes of important animals. Phenotypes and genotypes with 40k SNP were simulated for 25,000 animals, each with 4 traits assuming the same genetic variance and 0.8 genetic correlations. The population was split into 4 subpopulations corresponding to 4 countries, one for each trait. Additionally, a prediction population was created from genotyped animals that were not present in the individual countries but were related to each country's population. Genomic estimated breeding values were computed for each country and subsequently converted to SNP effects. Phenotypes were reconstructed for the prediction population based on the SNP effects of a country and the prediction animals' genotypes. The prediction population was used as the basis for the international evaluation, enabling bull comparisons without sharing genotypes and only sharing SNP effects. The computations were such that SNP effects computed within-country or in the prediction population were the same. Genomic estimated breeding values were calculated by single-trait GBLUP for within-country and multiple-trait GBLUP for multi-country predictions. The true accuracy for the prediction population with reconstructed phenotypes was at most 0.02 less than the accuracy with the original data. The differences increased when countries were assumed unequally sized. However, accuracies by multiple-trait GBLUP with the prediction population were always greater than accuracies from any single within-country prediction. Multi-country genomic evaluations by multiple-trait GBLUP are possible without using original genotypes at a cost of lower accuracy compared with explicitly combining countries' data.

12.
The accuracy of genomic prediction determines response to selection. It has been hypothesized that accuracy of genomic breeding values can be increased by a higher density of variants. We used imputed whole-genome sequence data and various single nucleotide polymorphism (SNP) selection criteria to estimate genomic breeding values in Brown Swiss cattle. The extreme scenarios were 50K SNP chip data and whole-genome sequence data with intermediate scenarios using linkage disequilibrium-pruned whole-genome sequence variants, only variants predicted to be missense, or the top 50K variants from genome-wide association studies. We estimated genomic breeding values for 3 traits (somatic cell score, nonreturn rate in heifers, and stature) and found differences in accuracy levels between traits. However, among different SNP sets, accuracy was very similar. In our analyses, sequence data led to a marginal increase in accuracy for 1 trait and was lower than 50K for the other traits. We concluded that the inclusion of imputed whole-genome sequence data does not lead to increased accuracy of genomic prediction with the methods.

13.
Single-step genomic BLUP (ssGBLUP) requires compatibility between genomic and pedigree relationships for unbiased and accurate predictions. Scaling the genomic relationship matrix (G) to have the same averages as the pedigree relationship matrix (i.e., scaling by averages) is one way to ensure compatibility. This requires computing both relationship matrices, calculating averages, and changing G, whereas only the inverses of those matrices are needed in the mixed model equations. Therefore, the compatibility process can add extra computing burden. In the single-step Bayesian regression, the scaling is done by including a mean (μg) as a fixed effect in the model. The parameter μg can be interpreted as the average of the breeding values of the genotyped animals. In this study, such scaling, called automatic, was implemented in ssGBLUP via Quaas-Pollak transformation of the inverse of the relationship matrix used in ssGBLUP (H), which combines the inverses of the pedigree and genomic relationship matrices. Comparisons involved a simulated data set, and the genomic relationship matrix was computed using different allele frequencies either from the current population (i.e., realized allele frequencies), equal among all the loci, or from the base population. For all of the scenarios, we computed bias [defined as the average difference between true breeding values (TBV) and genomic estimated breeding values (GEBV)], accuracy (defined as the correlation between TBV and GEBV), and dispersion (defined as the regression coefficient of GEBV on TBV). With no scaling, the bias expressed in terms of genetic standard deviations was 0.86, 0.64, and 0.58 with realized, equal, and base population allele frequencies, respectively. With scaling by averages, which is currently used in ssGBLUP, bias was 0.07, 0.08, and 0.03, respectively. With automatic scaling, bias was 0.18 regardless of allele frequencies. 
Accuracies were similar among scaling methods, but about 0.1 lower in the scenario without scaling. The GEBV were more inflated without any scaling, whereas the automatic scaling performed similarly to the scaling by averages. The average dispersion for those methods was 0.94. When μg was treated as random, with the variance equal to differences between pedigree and genomic relationships, the bias was the same as with the scaling by averages. The automatic scaling is biased, especially when μg is treated as a fixed effect. The bias may be small in real data with fewer generations, when traits are undergoing weak selection, or when the number of genotyped animals is large.
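The three validation statistics defined in this abstract translate directly into a few lines of NumPy. The sketch below follows the definitions as worded there (bias as the mean of TBV minus GEBV, accuracy as their correlation, dispersion as the slope of GEBV regressed on TBV); the function name `validation_stats` and the optional `sigma_g` argument for expressing bias in genetic standard deviations are assumptions for illustration.

```python
import numpy as np

def validation_stats(tbv, gebv, sigma_g=1.0):
    """Return (bias, accuracy, dispersion) for simulated data.

    tbv:     true breeding values.
    gebv:    genomic estimated breeding values.
    sigma_g: genetic standard deviation used to scale the bias (assumed input).
    """
    tbv = np.asarray(tbv, dtype=float)
    gebv = np.asarray(gebv, dtype=float)
    bias = np.mean(tbv - gebv) / sigma_g                  # average TBV - GEBV
    accuracy = np.corrcoef(tbv, gebv)[0, 1]               # correlation
    dispersion = np.cov(gebv, tbv)[0, 1] / np.var(tbv, ddof=1)  # slope of GEBV on TBV
    return bias, accuracy, dispersion
```

For example, if every GEBV equals half the TBV minus 1, the accuracy is 1 while the dispersion is 0.5 and the bias is positive, which shows that the three statistics capture different aspects of prediction quality.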

14.
Different approaches of calculating genomic measures of relationship were explored and compared with pedigree relationships (A) within and across base breeds in a crossbreed population, using genotypes for 38,194 loci of 4,106 Nordic Red dairy cattle. Four genomic relationship matrices (G) were calculated using either observed allele frequencies (AF) across breeds or within-breed AF. The G matrices were compared separately when the AF were estimated in the observed and in the base population. Breedwise AF in the current and base population were estimated using linear regression models of individual genotypes on breed composition. Different G matrices were further used to predict direct estimated genomic values using a genomic BLUP model. Higher variability existed in the diagonal elements of G across breeds (standard deviation = 0.06, on average) compared with A (0.01). The use of simple observed AF across base breeds to compute G increased coefficients for individuals in distantly related populations. Estimated breedwise AF reduced differences in coefficients similarly within and across populations. The variability of the current adjusted G matrix decreased from 0.055 to 0.035 when breedwise AF were estimated from the base breed population. The direct estimated genomic values and their validation reliabilities were, however, unaffected by AF used to compute G when estimated with a genomic BLUP model, due to inclusion of breed means in the model. In multibreed populations, G adjusted with breedwise AF from the founder population may provide more consistency among relationship coefficients between genotyped and ungenotyped individuals in an across-breed single-step evaluation.

15.
Genomic selection methodologies and genome-wide association studies use powerful statistical procedures that correlate large amounts of high-density SNP genotypes and phenotypic data. Actual 305-d milk (MY), fat (FY), and protein (PY) yield data on 695 cows and 76,355 genotyping-by-sequencing-generated SNP marker genotypes from Canadian Holstein dairy cows were used to characterize linkage disequilibrium (LD) structure of Canadian Holstein cows. Also, the comparison of pedigree-based BLUP, genomic BLUP (GBLUP), and Bayesian (BayesB) statistical methods in the genomic selection methodologies and the comparison of Bayesian ridge regression and BayesB statistical methods in the genome-wide association studies were carried out for MY, FY, and PY. Results from LD analysis revealed that as marker distance decreases, LD increases through chromosomes. However, unexpected high peaks in LD were observed between marker pairs with larger marker distances on all chromosomes. The GBLUP and BayesB models resulted in similar heritability estimates through 10-fold cross-validation for MY and PY; however, the GBLUP model resulted in higher heritability estimates than BayesB model for FY. The predictive ability of GBLUP model was significantly lower than that of BayesB for MY, FY, and PY. Association analyses indicated that 28 high-effect markers and markers on Bos taurus autosome 14 located within 6 genes (DOP1B, TONSL, CPSF1, ADCK5, PARP10, and GRINA) associated significantly with FY.

16.
It has been shown that single-step genomic BLUP (ssGBLUP) can be reformulated, resulting in an equivalent SNP model that includes the explicit imputation of gene contents of all ungenotyped animals in the pedigree. This reformulation reveals the underlying mechanism enabling ungenotyped animals to contribute information to genotyped animals via estimates of marker effects and consequently to the reliability of genomic predictions, a key feature generally associated with the single-step approach. Irrespective of which BLUP formulation is used for genomic prediction, with increasing numbers of genotyped animals, the marker-oriented model is recommended when calculating the reliabilities of genomic predictions. This approach has the advantage of a manageable and stable size of the model matrix that needs to be inverted to calculate analytical prediction error variances of marker effects, an advantage that also holds for prediction with the single-step model. However, when including imputed genotypes in the design matrix of marker effects, an additional imputation residual term has to be considered to account for the prediction error of imputation. We summarize some of the theoretical aspects associated with the calculation of analytical reliabilities of single-step predictions. Derivations are based on the equivalent reformulation of ssGBLUP as a marker-oriented model and the calculation of prediction error variances of marker effects. We propose 2 approximations that allow for a substantial reduction of the complexity of the matrix operations involved, while retaining most of the relevant information required for reliability calculations. We additionally provide a general framework for an implementation of single-step reliability approximation using standard animal model reliabilities as a starting point. 
Finally, we demonstrate the effectiveness of the proposed approach using a small example extracted from data of the routine evaluation on dual-purpose Fleckvieh (Simmental) cattle.

17.
The objective of this study was to predict genomic breeding values for milk yield of crossbred dairy cattle under different scenarios using single-step genomic BLUP (ssGBLUP). The data set included 13,880,217 milk yield measurements on 6,830,415 cows. Genotypes of 89,558 Holstein, 40,769 Jersey, and 22,373 Holstein-Jersey crossbred animals were used, of which all Holstein, 9,313 Jersey, and 1,667 crossbred animals had phenotypic records. Genotypes were imputed to 45K SNP markers. The SNP effects were estimated from single-breed evaluations for Jersey (JE), Holstein (HO), and crossbreds (CROSS), and multibreed evaluations including all Jersey and Holstein (JE_HO) or approximately equal proportions of Jersey, Holstein, and crossbred animals (MIX). Indirect predictions (IP) of the validation animals (358 crossbred animals with phenotypes excluded from evaluations) were calculated using the resulting SNP effects. Additionally, breed proportions (BP) of crossbred animals were applied as a weight when IP were estimated based on each pure breed. The predictive ability of IP was calculated as the Pearson correlation between IP and phenotypes of the validation animals adjusted for fixed effects in the model. Regression of adjusted phenotypes on IP was used to assess the inflation of IP. The predictive ability of IP for the CROSS, JE, HO, JE_HO, and MIX scenarios was 0.50, 0.50, 0.47, 0.50, and 0.46, respectively. Using BP was the least successful, with a predictive ability of 0.32. The inflation of the IP for crossbred animals using the CROSS, JE, HO, JE_HO, MIX, and BP scenarios was 1.17, 0.65, 0.55, 0.78, 1.00, and 0.85, respectively. The IP of crossbred animals can be predicted using single-step GBLUP under a scenario that includes purebred genotypes.
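The two validation statistics named in the abstract above (the Pearson correlation between IP and adjusted phenotypes, and the regression of adjusted phenotypes on IP) can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes that an indirect prediction is the sum of an animal's genotype codes multiplied by the estimated SNP effects, which the abstract does not spell out. The function names and toy numbers are hypothetical.

```python
from math import sqrt


def indirect_prediction(genotypes, snp_effects):
    """IP per animal as sum_j genotype_ij * effect_j (assumed definition)."""
    return [sum(g * a for g, a in zip(row, snp_effects)) for row in genotypes]


def _mean(values):
    return sum(values) / len(values)


def pearson(x, y):
    """Pearson correlation, used here as the predictive ability."""
    mx, my = _mean(x), _mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def regression_slope(y, x):
    """Slope of the simple regression of y on x.

    A slope of 1 indicates no inflation or deflation of the predictions;
    a slope below 1 indicates inflated predictions.
    """
    mx, my = _mean(x), _mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


if __name__ == "__main__":
    # Toy validation set: 4 animals, 3 SNP, made-up effects and phenotypes.
    genotypes = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [0, 2, 2]]
    snp_effects = [0.5, -0.2, 0.1]
    adjusted_phenotypes = [0.1, 0.2, 0.9, -0.3]

    ip = indirect_prediction(genotypes, snp_effects)
    print("predictive ability:", pearson(ip, adjusted_phenotypes))
    print("regression slope:  ", regression_slope(adjusted_phenotypes, ip))
```

Read this way, the slopes reported in the abstract (for example 1.00 for MIX and 0.55 for HO) measure how far the scale of the predictions departs from that of the adjusted phenotypes.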

18.
The objective was to compare methods of modeling missing pedigree in single-step genomic BLUP (ssGBLUP). Options for modeling missing pedigree included ignoring the missing pedigree, unknown parent groups (UPG) based on A (the numerator relationship matrix) or H (the unified pedigree and genomic relationship matrix), and metafounders. The assumptions for the distribution of estimated breeding values changed with the different models. We simulated data with heritabilities of 0.3 and 0.1 for dairy cattle populations that had more missing pedigrees for animals of lesser genetic merit. Predictions for the youngest generation and UPG solutions were compared with the true values for validation. For both traits, ssGBLUP with metafounders provided accurate and unbiased predictions for young animals while also appropriately accounting for genetic trend. Accuracy was least and bias was greatest for ssGBLUP with UPG for H for the trait with heritability of 0.3 and with UPG for A for the trait with heritability of 0.1. For the trait with heritability of 0.1 and UPG for H, the UPG accuracy (SD) was −0.49 (0.12), suggesting poor estimates of genetic trend despite having little bias for validations on young, genotyped animals. Problems with UPG estimates were likely caused by the lesser amount of information available for the lower heritability trait. Hence, UPG need to be defined differently based on the trait and amount of information. More research is needed to investigate accounting for UPG in A22 to better account for missing pedigrees for genotyped animals.

19.
The objective of this study was to assess the reliability and bias of estimated breeding values (EBV) from traditional BLUP with unknown parent groups (UPG), genomic EBV (GEBV) from single-step genomic BLUP (ssGBLUP) with UPG for the pedigree relationship matrix (A) only (SS_UPG), and GEBV from ssGBLUP with UPG for both A and the relationship matrix among genotyped animals (A22; SS_UPG2) using 6 large phenotype-pedigree truncated Holstein data sets. The complete data included 80 million records for milk, fat, and protein yields from 31 million cows recorded since 1980. Phenotype-pedigree truncation scenarios included truncation of phenotypes for cows recorded before 1990 and 2000 combined with truncation of pedigree information after 2 or 3 ancestral generations. A total of 861,525 genotyped bulls with progeny and cows with phenotypic records were used in the analyses. Reliability and bias (inflation/deflation) of GEBV were obtained for 2,710 bulls based on deregressed proofs, and on 381,779 cows born after 2014 based on predictivity (adjusted cow phenotypes). The BLUP reliabilities for young bulls varied from 0.29 to 0.30 across traits and were unaffected by data truncation and number of generations in the pedigree. Reliabilities ranged from 0.54 to 0.69 for SS_UPG and were slightly affected by phenotype-pedigree truncation. Reliabilities ranged from 0.69 to 0.73 for SS_UPG2 and were unaffected by phenotype-pedigree truncation. The regression coefficient of bull deregressed proofs on (G)EBV (i.e., GEBV and EBV) ranged from 0.86 to 0.90 for BLUP, from 0.77 to 0.94 for SS_UPG, and was 1.00 ± 0.03 for SS_UPG2. Cow predictivity ranged from 0.22 to 0.28 for BLUP, 0.48 to 0.51 for SS_UPG, and 0.51 to 0.54 for SS_UPG2. The highest cow predictivities for BLUP were obtained with the most extreme truncation, whereas for SS_UPG2, cow predictivities were also unaffected by phenotype-pedigree truncations. 
The regression coefficient of cow predictivities on (G)EBV was 1.02 ± 0.02 for SS_UPG2 with the most extreme truncation, which indicated the least biased predictions. Computations with the complete data set took 17 h with BLUP, 58 h with SS_UPG, and 23 h with SS_UPG2. The same computations with the most extreme phenotype-pedigree truncation took 7, 36, and 15 h, respectively. The SS_UPG2 converged in fewer rounds than BLUP, whereas SS_UPG took up to twice as many rounds. Thus, the ssGBLUP with UPG assigned to both A and A22 provided accurate and unbiased evaluations, regardless of phenotype-pedigree truncation scenario. Old phenotypes (before 2000 in this data set) did not affect the reliability of predictions for young selection candidates, especially in SS_UPG2.

20.
This study investigated the efficiency of genomic prediction when adding markers identified by a genome-wide association study (GWAS), using a data set of high-density (HD) markers imputed from 54K markers in Chinese Holsteins. Among 3,056 Chinese Holsteins with imputed HD data, 2,401 individuals born before October 1, 2009, were used for GWAS and as a reference population for genomic prediction, and the 220 younger cows were used as a validation population. In total, 1,403, 1,536, and 1,383 significant single nucleotide polymorphisms (SNP; false discovery rate at 0.05) associated with conformation final score, mammary system, and feet and legs were identified, respectively. About 2 to 3% of the genetic variance of the 3 traits was explained by these significant SNP. Only a very small proportion of significant SNP identified by GWAS was included in the 54K marker panel. Three new marker sets (54K+) were produced by adding the significant SNP obtained with a linear mixed model for each trait to the 54K marker panel. Genomic breeding values were predicted using a Bayesian variable selection (BVS) model. The accuracies of genomic breeding value by BVS based on the 54K+ data were 2.0 to 5.2% higher than those based on the 54K data. The imputed HD markers yielded 1.4% higher accuracy on average (BVS) than the 54K data. Both the 54K+ and HD data generated lower bias of genomic prediction, and the 54K+ data yielded the lowest bias in all situations. Our results show that the imputed HD data were not very useful for improving the accuracy of genomic prediction and that adding the significant markers derived from the imputed HD marker panel could improve the accuracy of genomic prediction and decrease the bias of genomic prediction.
