Similar literature
1.
Genomic prediction is widely used to select candidates for breeding. Size and composition of the reference population are important factors influencing prediction accuracy. In Holstein dairy cattle, large reference populations are used, but this is difficult to achieve in numerically small breeds and for traits that are not routinely recorded. The prediction accuracy is usually estimated using cross-validation, requiring the full data set. It would be useful to have a method to predict the benefit of multibreed reference populations that does not require the availability of the full data set. Our objective was to study the effect of the size and breed composition of the reference population on the accuracy of genomic prediction using genomic BLUP and Bayes R. We also examined the effect of trait heritability and validation breed on prediction accuracy. Using these empirical results, we investigated the use of a formula to predict the effect of the size and composition of the reference population on the accuracy of genomic prediction. Phenotypes were simulated in a data set containing real genotypes of imputed sequence variants for 22,752 dairy bulls and cows, including Holstein, Jersey, Red Holstein, and Australian Red cattle. Different reference populations were constructed, varying in size and composition, to study within-breed, multibreed, and across-breed prediction. Phenotypes were simulated varying in heritability, number of chromosomes, and number of quantitative trait loci. Genomic prediction was carried out using genomic BLUP and Bayes R. We used either the genomic relationship matrix (GRM) to estimate the number of independent chromosomal segments and subsequently to predict accuracy, or the accuracies obtained from single-breed reference populations to predict the accuracies of larger or multibreed reference populations. 
Using the GRM overestimated the accuracy; this overestimation was likely due to close relationships among some of the reference animals. Consequently, the GRM could not be used to predict the accuracy of genomic prediction reliably. However, a method using the prediction accuracies obtained by cross-validation with a small, single-breed reference population slightly overestimated the accuracy for a larger reference population of the same breed, but gave a reasonably close estimate of the accuracy for a multibreed reference population. This method could be useful for making decisions regarding the size and composition of the reference population.
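
The abstract does not state the formula it evaluated. As a minimal sketch, assume the widely cited deterministic form r = sqrt(N h² / (N h² + Me)), where N is the reference population size, h² the heritability, and Me the number of independent chromosomal segments. The function names and the numbers below are hypothetical and only illustrate how an accuracy observed with a small reference population could be extrapolated to a larger one.

```python
import math

def expected_accuracy(n_ref, h2, m_e):
    """Assumed deterministic form: r = sqrt(N*h2 / (N*h2 + Me))."""
    return math.sqrt(n_ref * h2 / (n_ref * h2 + m_e))

def implied_segments(n_ref, h2, observed_r):
    """Solve the same formula for Me, given an accuracy observed by cross-validation."""
    return n_ref * h2 * (1.0 - observed_r ** 2) / observed_r ** 2

# Illustrative numbers only (not taken from the study).
m_e = implied_segments(n_ref=1000, h2=0.3, observed_r=0.40)
r_large = expected_accuracy(n_ref=5000, h2=0.3, m_e=m_e)
print(round(m_e, 1), round(r_large, 3))
```

Back-solving Me from a small cross-validation avoids estimating it from the GRM, which the abstract reports as unreliable when reference animals are closely related.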

2.
The objective of this study was to investigate different strategies for genotype imputation in a population of crossbred Girolando (Gyr × Holstein) dairy cattle. The data set consisted of 478 Girolando, 583 Gyr, and 1,198 Holstein sires genotyped at high density with the Illumina BovineHD (Illumina, San Diego, CA) panel, which includes ~777K markers. The accuracy of imputation from low (20K) and medium densities (50K and 70K) to the HD panel density and from low to 50K density was investigated. Seven scenarios using different reference populations (RPop) considering Girolando, Gyr, and Holstein breeds separately or combinations of animals of these breeds were tested for imputing genotypes of 166 randomly chosen Girolando animals. Population-based genotype imputation was performed using FImpute. Imputation accuracy was measured as the correlation between observed and imputed genotypes (CORR) and also as the proportion of genotypes that were imputed correctly (CR). This is the first paper on imputation accuracy in a Girolando population. The sample-specific imputation accuracies ranged from 0.38 to 0.97 (CORR) and from 0.49 to 0.96 (CR) imputing from low and medium densities to HD, and 0.41 to 0.95 (CORR) and from 0.50 to 0.94 (CR) for imputation from 20K to 50K. The CORRanim exceeded 0.96 (for 50K and 70K panels) when only Girolando animals were included in RPop (S1). We found smaller CORRanim when Gyr (S2) was used instead of Holstein (S3) as RPop. The same behavior was observed between S4 (Gyr + Girolando) and S5 (Holstein + Girolando) because the target animals were more related to the Holstein population than to the Gyr population. The highest imputation accuracies were observed for scenarios including Girolando animals in the reference population, whereas using only Gyr animals resulted in low imputation accuracies, suggesting that the haplotypes segregating in the Girolando population had a greater effect on accuracy than the purebred haplotypes. 
All chromosomes had similar imputation accuracies (CORRsnp) within each scenario. Crossbred animals (Girolando) must be included in the reference population to provide the best imputation accuracies.
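
The two accuracy measures named in the abstract, CORR and CR, can be sketched as follows. This assumes genotypes are coded as allele counts (0, 1, 2); the function name and the toy data are hypothetical.

```python
import numpy as np

def imputation_accuracy(observed, imputed):
    """Return (CORR, CR) for genotypes coded as 0, 1, 2 allele counts."""
    observed = np.asarray(observed, dtype=float)
    imputed = np.asarray(imputed, dtype=float)
    corr = np.corrcoef(observed, imputed)[0, 1]   # Pearson correlation
    cr = float(np.mean(observed == imputed))      # share imputed correctly
    return corr, cr

# Toy example: one of eight genotypes is imputed incorrectly.
obs = [0, 1, 2, 2, 1, 0, 1, 2]
imp = [0, 1, 2, 1, 1, 0, 1, 2]
corr, cr = imputation_accuracy(obs, imp)
print(corr, cr)
```

Applying the function to one animal's genotypes across SNP gives a per-animal value, and applying it to one SNP across animals gives a per-SNP value. CORR is undefined when either vector is constant, so monomorphic SNP would need to be skipped.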

3.
To facilitate routine genomic evaluation, a database was constructed to store genotypes for 50,972 single nucleotide polymorphisms (SNP) from the Illumina BovineSNP50 BeadChip (Illumina Inc., San Diego, CA). Multiple samples per animal are allowed. All SNP genotypes for a sample are stored in a single row. An indicator specifies whether the genotype for a sample was selected for use in genomic evaluation. Samples with low call rates or pedigree conflicts are designated as unusable. Among multiple samples that qualify for use in genomic evaluation, the one with the highest call rate is designated as usable. When multiple samples are stored for an animal, a composite is formed during extraction by using SNP genotypes from other samples to replace missing genotypes. To increase the number of SNP available, scanner output for approximately 19,000 samples was reprocessed. Any SNP with a minor allele frequency of ≥1% for Holsteins, Jerseys, or Brown Swiss was selected, which was the primary reason that the number of SNP used for USDA genomic evaluations increased. Few parent-progeny conflicts (≤1%) and a high call rate (≥90%) were additional requirements that eliminated 2,378 SNP. Because monomorphic SNP did not degrade convergence during estimation of SNP effects, a single set of 43,385 SNP was adopted for all breeds. The use of a database for genotypes, detection of conflicts as genotypes are stored, online access for problem resolution, and use of a single set of SNP for genomic evaluations have simplified tracking of genotypes and genomic evaluation as a routine and official process.
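
The composite-genotype step described above might look like the sketch below. The `MISSING` marker, the function names, and the rule of filling gaps in descending call-rate order are assumptions for illustration; the abstract only says that missing genotypes are replaced using other samples from the same animal.

```python
MISSING = None  # hypothetical marker for a no-call

def call_rate(genotypes):
    """Fraction of SNP with a called genotype."""
    return sum(g is not MISSING for g in genotypes) / len(genotypes)

def composite_genotype(samples):
    """Start from the sample with the highest call rate and fill its
    missing calls from the remaining samples, in descending call-rate order."""
    ranked = sorted(samples, key=call_rate, reverse=True)
    composite = list(ranked[0])
    for other in ranked[1:]:
        for i, g in enumerate(composite):
            if g is MISSING and other[i] is not MISSING:
                composite[i] = other[i]
    return composite

# Two samples for one animal, genotypes coded 0/1/2.
sample_a = [0, MISSING, 2, 1]
sample_b = [0, 1, MISSING, MISSING]
print(composite_genotype([sample_a, sample_b]))
```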

4.
The genomic evaluation system in the United States: past, present, future
Implementation of genomic evaluation has caused profound changes in dairy cattle breeding. All young bulls bought by major artificial insemination organizations now are selected based on such evaluation. Evaluation reliability can reach approximately 75% for yield traits, which is adequate for marketing semen of 2-yr-old bulls. Shortened generation interval from using genomic evaluations is the most important factor in increasing the rate of genetic improvement. Genomic evaluations are based on 42,503 single nucleotide polymorphisms (SNP) genotyped with technology that became available in 2007. The first unofficial USDA genomic evaluations were released in 2008 and became official for Holsteins, Jerseys, and Brown Swiss in 2009. Evaluation accuracy has increased steadily from including additional bulls with genotypes and traditional evaluations (predictor animals). Some of that increase occurs automatically as young genotyped bulls receive a progeny test evaluation at 5 yr of age. Cow contribution to evaluation accuracy is increased by decreasing mean and variance of their evaluations so that they are similar to bull evaluations. Integration of US and Canadian genotype databases was critical to achieving acceptable initial accuracy and continues to benefit both countries. Genotype exchange with other countries added predictor bulls for Brown Swiss. In 2010, a low-density chip with 2,900 SNP and a high-density chip with 777,962 SNP were released. The low-density chip has increased greatly the number of animals genotyped and is expected to replace microsatellites in parentage verification. The high-density chip can increase evaluation accuracy by better tracking of loci responsible for genetic differences. To integrate information from chips of various densities, a method to impute missing genotypes was developed based on splitting each genotype into its maternal and paternal haplotypes and tracing their inheritance through the pedigree. 
The same method is used to impute genotypes of nongenotyped dams based on genotyped progeny and mates. Reliability of resulting evaluations is discounted to reflect errors inherent in the process. Further increases in evaluation accuracy are expected because of added predictor animals and more SNP. The large population of existing genotypes can be used to evaluate new traits; however, phenotypic observations must be obtained for enough animals to allow estimation of SNP effects with sufficient accuracy for application to the general population.

5.
Genome-wide selection aims to predict genetic merit of individuals by estimating the effect of chromosome segments on phenotypes using dense single nucleotide polymorphism (SNP) marker maps. In the present paper, principal component analysis was used to reduce the number of predictors in the estimation of genomic breeding values for a simulated population. Principal component extraction was carried out either using all markers available or separately for each chromosome. Priors of predictor variance were based on their contribution to the total SNP correlation structure. The principal component approach yielded the same accuracy of predicted genomic breeding values obtained with the regression using SNP genotypes directly, with a reduction in the number of predictors of about 96% and computation time of 99%. Although these accuracies are lower than those currently achieved with Bayesian methods, at least for simulated data, the improved calculation speed together with the possibility of extracting principal components directly on individual chromosomes may represent an interesting option for predicting genomic breeding values in real data with a large number of SNP. The use of phenotypes as dependent variable instead of conventional breeding values resulted in more reliable estimates, thus supporting the current strategies adopted in research programs of genomic selection in livestock.
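
A rough sketch of the predictor-reduction idea follows, under simplifying assumptions: the data are simulated at random, principal components are extracted with a plain SVD, and the regression is ordinary least squares. The abstract instead describes priors on predictor variance, which this sketch does not implement. All names and sizes are hypothetical.

```python
import numpy as np

def pc_scores(genotypes, n_components):
    """Project centered SNP genotypes (animals x SNP) onto the leading PCs."""
    centered = genotypes - genotypes.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

def fit_pc_regression(scores, phenotypes):
    """Ordinary least squares of phenotypes on PC scores plus an intercept."""
    design = np.column_stack([np.ones(len(scores)), scores])
    coef, *_ = np.linalg.lstsq(design, phenotypes, rcond=None)
    return coef

rng = np.random.default_rng(0)
snp = rng.integers(0, 3, size=(200, 1000)).astype(float)   # simulated genotypes
y = snp[:, :10].sum(axis=1) + rng.normal(size=200)          # simulated phenotype
scores = pc_scores(snp, n_components=40)                    # 1000 SNP -> 40 predictors
coef = fit_pc_regression(scores, y)
print(scores.shape, coef.shape)
```

Running `pc_scores` on the columns of one chromosome at a time would correspond to the per-chromosome extraction mentioned in the abstract.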

6.
With the introduction of new single nucleotide polymorphism (SNP) chips of various densities, more and more genotype data sets will include animals genotyped for only a subset of the SNP. Imputation techniques based on unobserved ancestral haplotypes may be used to infer missing genotypes. These ancestral haplotypes may also be used in the genomic prediction model, instead of using the SNP. This may increase the reliability of predictions because the ancestral haplotype may capture more linkage disequilibrium with quantitative trait loci than SNP. The aim of this paper was to study whether using unobserved ancestral haplotypes in a genomic prediction model would provide more reliable genomic predictions than using SNP, and to determine how many loci in the genomic prediction model would be redundant. Genotypes of 8,960 bulls and cows for 39,557 SNP were analyzed with a hidden Markov model to associate each individual at each locus to 2 ancestral haplotypes. The number of ancestral haplotypes per locus was fixed at 10, 15, or 20. Subsequently, a validation study was performed in which the phenotypes of 3,251 progeny-tested bulls for 16 traits were used in a genomic prediction model to predict the estimated breeding values of at least 753 validation bulls. The squared correlation between genomic prediction and deregressed daughter performance estimated breeding value, when averaged across traits, was slightly higher when 15 or 20 ancestral haplotypes per locus were used in the prediction model instead of the SNP genotypes, whereas the prediction model using a genomic relationship matrix gave the lowest squared correlations. The number of redundant loci [i.e., loci that had less than 18 jumps (0.1%) from one ancestral haplotype to another ancestral haplotype at the next locus], was 18,793 (48%), which means that only 20,764 loci would need to be included in the genomic prediction model. 
This provides opportunities for greatly decreasing computer requirements of genomic evaluations with very large numbers of markers.

7.
Genomic selection using dense markers covering the whole genome is a tool for the genetic improvement of livestock and is revolutionizing the breeding system in dairy cattle. Progeny-tested bulls have been used to form reference populations in almost all countries where genomic selection has been implemented. In this study, the accuracy of genomic prediction when cows are used to form the reference population was investigated. The reference population consisted of 3,087 cows. All individuals were genotyped with Illumina BovineSNP50. After genotype imputation and editing, 48,676 single nucleotide polymorphisms were available for analysis. Two methods, genomic BLUP (GBLUP) and BayesB, were used to render genomic estimated breeding values (GEBV) for 5 milk production traits. Accuracies of GEBV were assessed in 3 ways: rGEBV,EBV (the correlation between GEBV and conventional EBV) in 67 progeny-tested bulls, rGEBV,EBV from a 5-fold cross validation in the 3,087 cow reference population, and the theoretical accuracy (for GBLUP) calculated in the same way as for conventional BLUP. The results showed that using GBLUP, the rGEBV,EBV and theoretical accuracy of genomic prediction in Chinese Holstein ranged from 0.59 to 0.76 and 0.70 to 0.80, respectively, which was 0.13 to 0.30 and 0.23 to 0.33 higher than the accuracies of conventional pedigree index, respectively. The results indicate that, as an alternative, genomic selection using cows in the reference population is feasible.

8.
The availability of dense single nucleotide polymorphism (SNP) genotypes for dairy cattle has created exciting research opportunities and revolutionized practical breeding programs. Broader application of this technology will lead to situations in which genotypes from different low-, medium-, or high-density platforms must be combined. In this case, missing SNP genotypes can be imputed using family- or population-based algorithms. Our objective was to evaluate the accuracy of imputation in Jersey cattle, using reference panels comprising 2,542 animals with 43,385 SNP genotypes and study samples of 604 animals for which genotypes were available for 1, 2, 5, 10, 20, 40, or 80% of loci. Two population-based algorithms, fastPHASE 1.2 (P. Scheet and M. Stevens; University of Washington TechTransfer Digital Ventures Program, Seattle, WA) and IMPUTE 2.0 (B. Howie and J. Marchini; Department of Statistics, University of Oxford, UK), were used to impute genotypes on Bos taurus autosomes 1, 15, and 28. The mean proportion of genotypes imputed correctly ranged from 0.659 to 0.801 when 1 to 2% of genotypes were available in the study samples, from 0.733 to 0.964 when 5 to 20% of genotypes were available, and from 0.896 to 0.995 when 40 to 80% of genotypes were available. In the absence of pedigrees or genotypes of close relatives, the accuracy of imputation may be modest (generally <0.80) when low-density platforms with fewer than 1,000 SNP are used, but population-based algorithms can provide reasonably good accuracy (0.80 to 0.95) when medium-density platforms of 2,000 to 4,000 SNP are used in conjunction with high-density genotypes (e.g., >40,000 SNP) from a reference population. Accurate imputation of high-density genotypes from inexpensive low- or medium-density platforms could greatly enhance the efficiency of whole-genome selection programs in dairy cattle.

9.
The objectives of this study were to describe, using the goat SNP50 BeadChip (Illumina Inc., San Diego, CA), molecular data for the French dairy goat population and compare the effect of using genomic information on breeding value accuracy in different reference populations. Several multi-breed (Alpine and Saanen) reference population sizes, including or excluding female genotypes (from 67 males to 677 males, and 1,985 females), were used. Genomic evaluations were performed using genomic best linear unbiased predictor for milk production traits, somatic cell score, and some udder type traits. At a marker distance of 50 kb, the average r² (squared correlation coefficient) value of linkage disequilibrium was 0.14, and persistence of linkage disequilibrium as correlation of r-values among Saanen and Alpine breeds was 0.56. Genomic evaluation accuracies obtained from cross validation ranged from 36 to 53%. Biases of these estimations assessed by regression coefficients (from 0.73 to 0.98) of phenotypes on genomic breeding values were higher for traits such as protein yield than for udder type traits. Using the reference population that included all males and females, accuracies of genomic breeding values derived from prediction error variances (model accuracy) obtained for young buck candidates without phenotypes ranged from 52 to 56%. This was lower than the average pedigree-derived breeding value accuracies obtained at birth for these males from the official genetic evaluation (62%). Adding females to the reference population of 677 males improved accuracy by 5 to 9% depending on the trait considered. Gains in model accuracies of genomic breeding values ranged from 1 to 7%, lower than reported in other studies. The gains in breeding value accuracy obtained using genomic information were not as good as expected because of the limited size (at most 677 males and 1,985 females) and the structure of the reference population.
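
The two linkage disequilibrium summaries mentioned above can be sketched as follows. This assumes unphased genotypes coded as allele counts, so the genotype correlation is used as a stand-in for the haplotype-based r. The function names and toy data are hypothetical.

```python
import numpy as np

def ld_r(geno_a, geno_b):
    """Signed correlation between allele counts (0/1/2) at two loci."""
    return np.corrcoef(geno_a, geno_b)[0, 1]

def persistence_of_phase(r_breed_1, r_breed_2):
    """Correlation of r-values for the same marker pairs in two breeds."""
    return np.corrcoef(r_breed_1, r_breed_2)[0, 1]

# Toy genotypes at two loci for six animals.
locus_1 = [0, 1, 2, 1, 0, 2]
locus_2 = [0, 1, 2, 2, 0, 1]
r = ld_r(locus_1, locus_2)
print(r, r ** 2)   # r and the squared correlation r^2
```

In practice `ld_r` would be evaluated for every marker pair within a distance bin (for example, pairs about 50 kb apart) in each breed, and the two resulting vectors of r-values passed to `persistence_of_phase`.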

10.
Methods for genomic prediction were evaluated for an Israeli Holstein dairy population of 713,686 cows and 1,305 progeny-tested bulls with genotypes. Inclusion of genotypes of 343 elite cows in an evaluation method that considers pedigree, phenotypes, and genotypes simultaneously was also evaluated. Two data sets were available: a complete data set with production records from 1985 through 2011, and a reduced data set with records after 2006 deleted. For each production trait, a multitrait animal model was used to compute traditional genetic evaluations for parities 1 through 3 as separate traits. Evaluations were calculated for the reduced and complete data sets. The evaluations from the reduced data set were used to calculate parent average for validation bulls, which was the benchmark for comparing gain in predictive ability from genomics. Genomic predictions for bulls in 2006 were calculated using a Bayesian regression method (BayesC), genomic BLUP (GBLUP), single-step GBLUP (ssGBLUP), and weighted ssGBLUP (WssGBLUP). Predictions using BayesC and GBLUP were calculated either with or without an index that included parent average. Genomic predictions that included elite cow genotypes were calculated using ssGBLUP and WssGBLUP. Predictive ability was assessed by coefficients of determination (R2) and regressions of predictions of 135 validation bulls with no daughters in 2006 on deregressed evaluations of those bulls in 2011. A reduction in R2 and regression coefficients was observed from parities 1 through 3. Fat and protein yields had the lowest R2 for all the methods. On average, R2 was lowest for parent averages, followed by GBLUP, BayesC, ssGBLUP, and WssGBLUP. For some traits, R2 for direct genomic values from BayesC and GBLUP were lower than those for parent averages. 
Genomic estimated breeding values using ssGBLUP were the least biased, and this method appears to be a suitable tool for genomic evaluation of a small genotyped population, as it automatically accounts for parental index, allows for inclusion of female genomic information without preadjustments in evaluations, and uses the same model as in traditional evaluations. Weighted ssGBLUP has the potential for higher evaluation accuracy.

11.
Genomic selection using 50,000 single nucleotide polymorphism (50k SNP) chips has been implemented in many dairy cattle breeding programs. Cheap, low-density chips make genotyping of a larger number of animals cost effective. A commonly proposed strategy is to impute low-density genotypes up to 50,000 genotypes before predicting direct genomic values (DGV). The objectives of this study were to investigate the accuracy of imputation for animals genotyped with a low-density chip and to investigate the effect of imputation on reliability of DGV. Low-density chips contained 384, 3,000, or 6,000 SNP. The SNP were selected based either on the highest minor allele frequency in a bin or the middle SNP in a bin, and DAGPHASE, CHROMIBD, and multivariate BLUP were used for imputation. Genotypes of 9,378 animals were used, from which approximately 2,350 animals had deregressed proofs. Bayesian stochastic search variable selection was used for estimating SNP effects of the 50k chip. Imputation accuracies and imputation error rates were poor for low-density chips with 384 SNP. Imputation accuracies were higher with 3,000 and 6,000 SNP. Performance of DAGPHASE and CHROMIBD was very similar and much better than that of multivariate BLUP for both imputation accuracy and reliability of DGV. With 3,000 SNP and using CHROMIBD or DAGPHASE for imputation, 84 to 90% of the increase in DGV reliability using the 50k chip, compared with a pedigree index, was obtained. With multivariate BLUP, the increase in reliability was only 40%. With 384 SNP, the reliability of DGV was lower than for a pedigree index, whereas with 6,000 SNP, about 93% of the increase in reliability of DGV based on the 50k chip was obtained when using DAGPHASE for imputation. Using genotype probabilities to predict gene content increased imputation accuracy and the reliability of DGV and is therefore recommended for applications of imputation for genomic prediction. 
A deterministic equation was derived to predict accuracy of DGV based on imputation accuracy, which fitted closely with the observed relationship. The deterministic equation can be used to evaluate the effect of differences in imputation accuracy on accuracy and reliability of DGV.

12.
Compared with the currently widely used multi-step genomic models for genomic evaluation, single-step genomic models can provide more accurate genomic evaluation by jointly analyzing phenotypes and genotypes of all animals and can properly correct for the effect of genomic preselection on genetic evaluations. The objectives of this study were to introduce a single-step genomic model, allowing a direct estimation of single nucleotide polymorphism (SNP) effects, and to develop efficient computing algorithms for solving equations of the single-step SNP model. We proposed an alternative to the current single-step genomic model based on the genomic relationship matrix by including an additional step for estimating the effects of SNP markers. Our single-step SNP model allowed flexible modeling of SNP effects in terms of the number and variance of SNP markers. Moreover, our single-step SNP model included a residual polygenic effect with trait-specific variance for reducing inflation in genomic prediction. A kernel calculation of the SNP model involved repeated multiplications of the inverse of the pedigree relationship matrix of genotyped animals with a vector, for which numerical methods such as preconditioned conjugate gradients can be used. For estimating SNP effects, a special updating algorithm was proposed to separate residual polygenic effects from the SNP effects. We extended our single-step SNP model to general multiple-trait cases. By taking advantage of a block-diagonal (co)variance matrix of SNP effects, we showed how to estimate multivariate SNP effects in an efficient way. A general prediction formula was derived for candidates without phenotypes, which can be used for frequent, interim genomic evaluations without running the whole genomic evaluation process. We discussed various issues related to implementation of the single-step SNP model in Holstein populations with an across-country genomic reference population.
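
Preconditioned conjugate gradients, named above as a way to apply a matrix inverse to a vector without forming the inverse, can be sketched generically as below. This is a textbook version with a Jacobi (diagonal) preconditioner on a small dense matrix, not the authors' implementation. A real evaluation would replace `a @ p` with a sparse or matrix-free product.

```python
import numpy as np

def pcg(a, b, tol=1e-10, max_iter=1000):
    """Solve a @ x = b for symmetric positive-definite a (Jacobi preconditioner)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    x = np.zeros_like(b)
    if not np.any(b):
        return x                      # trivial solution for a zero right-hand side
    m_inv = 1.0 / np.diag(a)          # inverse of the diagonal preconditioner
    r = b - a @ x                     # initial residual
    z = m_inv * r
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        ap = a @ p
        alpha = rz / (p @ ap)
        x += alpha * p
        r -= alpha * ap
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        z = m_inv * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

a = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = pcg(a, b)
print(x)
```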

13.
Journal of Dairy Science, 2022, 105(4): 3306-3322
Genomic evaluation based on a single-step model uses all available data of phenotype, genotype, and pedigree; therefore, it should provide unbiased genomic breeding values with a higher correlation of prediction than the current multistep genomic model. Since 2019, a mixed reference population of cows and bulls has been applied to the routine multistep genomic evaluation in German Holsteins. For a fair comparison between the single-step and multistep genomic models, the same phenotype, genotype, and pedigree data were used. Because of its simple structure of the standard multitrait animal model used for German Holstein conventional evaluation, conformation traits were chosen as the first trait group to test a single-step SNP BLUP model for the large, genotyped population of German Holsteins. Genotype, phenotype, and pedigree data were taken from the official August 2020 conventional and genomic evaluation. Because of the same trait definition in national and multiple across-country evaluation for the conformation traits, deregressed multiple across-country evaluation estimated breeding value (EBV) of foreign bulls were treated as a new source of data for the same trait in the genomic evaluations. Due to a short history of female genotyping in Germany, the last 3 yr of youngest cows and bulls were deleted, instead of 4 yr, to perform a genomic validation. In comparison to the multistep genomic model, the single-step SNP BLUP model resulted in a higher correlation and greater variance of genomic EBV according to 798 national validation bulls. The regression of genomic prediction of the current, full evaluation on the earlier, truncated evaluation was slightly closer to 1 than the multistep model. For the validation bulls or youngest genomic artificial insemination bulls, correlation of genomic EBV between the 2 models was, on average, 0.95 across all the conformation traits. 
We did not find overprediction of young animals by the single-step SNP BLUP model for the conformation traits in German Holsteins.
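
The validation statistics discussed above (correlation and the regression of the full evaluation on the truncated one) can be sketched as follows. The function name and the toy numbers are hypothetical, and simple unweighted regression is assumed.

```python
import numpy as np

def validation_stats(truncated_pred, full_eval):
    """Regress the later, full evaluation on the earlier, truncated prediction.

    Returns (slope, r_squared). A slope below 1 suggests the early
    predictions were inflated; a slope near 1 suggests little bias.
    """
    x = np.asarray(truncated_pred, dtype=float)
    y = np.asarray(full_eval, dtype=float)
    slope, _intercept = np.polyfit(x, y, 1)
    r_squared = np.corrcoef(x, y)[0, 1] ** 2
    return slope, r_squared

# Toy data: every later evaluation is 80% of the early prediction.
early = [1.0, 2.0, 3.0, 4.0]
later = [0.8, 1.6, 2.4, 3.2]
slope, r_squared = validation_stats(early, later)
print(slope, r_squared)
```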

14.
The first national single-step, full-information (phenotype, pedigree, and marker genotype) genetic evaluation was developed for final score of US Holsteins. Data included final scores recorded from 1955 to 2009 for 6,232,548 Holstein cows. BovineSNP50 (Illumina, San Diego, CA) genotypes from the Cooperative Dairy DNA Repository (Beltsville, MD) were available for 6,508 bulls. Three analyses used a repeatability animal model as currently used for the national US evaluation. The first 2 analyses used final scores recorded up to 2004. The first analysis used only a pedigree-based relationship matrix. The second analysis used a relationship matrix based on both pedigree and genomic information (single-step approach). The third analysis used the complete data set and only the pedigree-based relationship matrix. The fourth analysis used predictions from the first analysis (final scores up to 2004 and only a pedigree-based relationship matrix) and prediction using a genomic-based matrix to obtain genetic evaluation (multiple-step approach). Different allele frequencies were tested in construction of the genomic relationship matrix. Coefficients of determination between predictions of young bulls from parent average, single-step, and multiple-step approaches and their 2009 daughter deviations were 0.24, 0.37 to 0.41, and 0.40, respectively. The highest coefficient of determination for a single-step approach was observed when using a genomic relationship matrix with assumed allele frequencies of 0.5. Coefficients for regression of 2009 daughter deviations on parent-average, single-step, and multiple-step predictions were 0.76, 0.68 to 0.79, and 0.86, respectively, which indicated some inflation of predictions. The single-step regression coefficient could be increased up to 0.92 by scaling differences between the genomic and pedigree-based relationship matrices with little loss in accuracy of prediction. 
One complete evaluation took about 2 h of computing time and 2.7 gigabytes of memory. Computing times for single-step analyses were slightly longer (2%) than for pedigree-based analysis. A national single-step genetic evaluation with the pedigree relationship matrix augmented with genomic information provided genomic predictions with accuracy and bias comparable to multiple-step procedures and could account for any population or data structure. Advantages of single-step evaluations should increase in the future when animals are pre-selected on genotypes.
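
The abstract does not give the formula used to build the genomic relationship matrix. As a sketch, assume the commonly used form G = ZZ' / (2 Σ p(1 − p)), with Z the 0/1/2 genotype matrix centered by twice the allele frequency p (often attributed to VanRaden). The `allele_freq` argument is a hypothetical parameter illustrating how different frequency assumptions, such as 0.5 for every SNP, enter the construction.

```python
import numpy as np

def genomic_relationship_matrix(genotypes, allele_freq=None):
    """G = Z Z' / (2 * sum p(1-p)), with Z = M - 2p and M coded 0/1/2.

    genotypes: animals x SNP matrix of allele counts.
    allele_freq: None to estimate p from the data, a scalar (e.g. 0.5)
    applied to every SNP, or one value per SNP.
    """
    m = np.asarray(genotypes, dtype=float)
    if allele_freq is None:
        p = m.mean(axis=0) / 2.0
    else:
        p = np.broadcast_to(np.asarray(allele_freq, dtype=float), (m.shape[1],))
    z = m - 2.0 * p
    return z @ z.T / (2.0 * np.sum(p * (1.0 - p)))

geno = np.array([[0, 2], [2, 0]])
g_half = genomic_relationship_matrix(geno, allele_freq=0.5)
print(g_half)
```

Estimating `p` from the genotyped animals requires at least one segregating SNP, since otherwise the denominator is zero.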

15.
Milkability is a trait related to the milking efficiency of an animal, and it is a component of the herd profitability. Due to its economic importance, milkability is currently included in the selection index of the Italian Simmental cattle breed with a weight of 7.5%. This lowly heritable trait is measured on a subjective scale from 1 to 3 (1 = slow, 3 = fast), and genetic evaluations are performed by pedigree-based BLUP. Genomic information is now available for some animals in the Italian Simmental population, and its inclusion in the genetic evaluation system could increase accuracy of breeding values and genetic progress for milkability. The aim of this study was to test the feasibility and advantages of having a genomic evaluation for this trait in the Italian Simmental population. Phenotypes were available for 131,308 cows. A total of 9,526 animals had genotypes for 42,152 loci; among the genotyped animals, 2,455 were cows with phenotypes, and the other were their relatives. The youngest cows with both phenotypes and genotypes (n = 900) were identified as selection candidates. Variance components and heritability were estimated using pedigree information, whereas genetic and genomic evaluations were carried out using BLUP and single-step genomic BLUP (ssGBLUP), respectively. In addition, a weighted ssGBLUP was assessed using genomic regions from a genome-wide association study. Evaluation models were validated using theoretical and realized accuracies. The estimated heritability for milkability was 0.12 ± 0.01. The mean theoretical accuracies for selection candidates were 0.43 ± 0.08 (BLUP) and 0.53 ± 0.06 (ssGBLUP). The mean realized accuracies based on linear regression statistics were 0.29 (BLUP) and 0.40 (ssGBLUP). No genomic regions were significantly associated with milkability, thus no improvements in accuracy were observed when using weighted ssGBLUP. 
Results indicated that genomic information could improve the accuracy of breeding values and increase genetic progress for milkability in Italian Simmental.

16.
Cost-effective high-density (HD) genotypes of livestock species can be obtained by genotyping a proportion of the population using a HD panel and the remainder using a cheaper low-density panel, and then imputing the missing genotypes that are not directly assayed in the low-density panel. The efficacy of genotype imputation can largely be affected by the structure and history of the specific target population and it should be checked before incorporating imputation in routine genotyping practices. Here, we investigated the efficacy of imputation in crossbred dairy cattle populations of East Africa using 4 different commercial single nucleotide polymorphisms (SNP) panels, 3 reference populations, and 3 imputation algorithms. We found that Minimac and a reference population, which included a mixture of crossbred and ancestral purebred animals, provided the highest imputation accuracy compared with other scenarios of imputation. The accuracies of imputation, measured as the correlation between real and imputed genotypes averaged across SNP, were around 0.76 and 0.94 for 7K and 40K SNP, respectively, when imputed up to a 770K panel. We also presented a method to maximize the imputation accuracy of low-density panels, which relies on the pairwise (co)variances between SNP and the minor allele frequency of SNP. The performance of the developed method was tested in a 5-fold cross-validation process where various densities of SNP were selected using the (co)variance method and also by alternative SNP selection methods and then imputed up to the HD panel. The (co)variance method provided the highest imputation accuracies at almost all marker densities, with accuracies being up to 0.19 higher than the random selection of SNP. The accuracies of imputation from 7K and 40K panels selected using the (co)variance method were around 0.80 and 0.94, respectively. The presented method also achieved higher accuracy of genomic prediction at lower densities of selected SNP. 
The squared correlation between genomic breeding values estimated using imputed genotypes and those from the real 770K HD panel was 0.95 when the accuracy of imputation was 0.64. The presented method for SNP selection is straightforward in its application and can ensure high accuracies in genotype imputation of crossbred dairy populations in East Africa.
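The accuracy measure described above (correlation between real and imputed genotypes, averaged across SNP) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the 0/1/2 genotype coding, the toy data, and the choice to skip SNP with no variation are all assumptions.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences; None if either is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)

def imputation_accuracy(real, imputed):
    """Mean per-SNP correlation between real and imputed genotypes.

    real, imputed: one row per animal, one column per SNP (assumed coded 0/1/2).
    SNP whose correlation is undefined (no variation) are skipped; this is an assumption.
    """
    n_snp = len(real[0])
    cors = []
    for j in range(n_snp):
        r = pearson([row[j] for row in real], [row[j] for row in imputed])
        if r is not None:
            cors.append(r)
    return sum(cors) / len(cors)

# Hypothetical toy data: 4 animals, 3 SNP, two imputation errors.
real = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [0, 2, 2]]
imputed = [[0, 1, 2], [1, 2, 0], [2, 0, 1], [0, 2, 1]]
print(round(imputation_accuracy(real, imputed), 3))
```

Averaging per-SNP correlations, rather than pooling all genotypes into one correlation, keeps SNP with different allele frequencies from dominating the summary.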

17.
Given the interest in including dry matter intake (DMI) in the breeding goal, accurate estimated breeding values (EBV) for DMI are needed, preferably for separate lactations. Due to the limited number of records available on DMI, 2 main approaches have been suggested to compute those EBV: (1) the inclusion of predictor traits, such as fat- and protein-corrected milk (FPCM) and live weight (LW), and (2) the addition of genomic information of animals using what is called genomic prediction. Recently, several methodologies for estimating EBV from genomic information have become available. In this study, a new method known as single-step ridge-regression BLUP (SSRR-BLUP) is suggested. The SSRR-BLUP method does not have an imposed limit on the number of genotyped animals, as the commonly used methods do. The objective of this study was to estimate genetic parameters using a relatively large data set with DMI records, and to compare the accuracies of the EBV for DMI. These accuracies were obtained using 4 different methods: BLUP (using pedigree for all animals with phenotypes), genomic BLUP (GBLUP; only for genotyped animals), single-step GBLUP (SS-GBLUP), and SSRR-BLUP (for genotyped and nongenotyped animals). Records from different lactations, with or without predictor traits (FPCM and LW), were used in the model. Accuracies of EBV for DMI (defined as the correlation between the EBV and pre-adjusted DMI phenotypes divided by the average accuracy of those phenotypes) ranged between 0.21 and 0.38 across methods and scenarios. Accuracies of EBV for DMI using BLUP were the lowest accuracies obtained across methods. Meanwhile, accuracies of EBV for DMI were similar in SS-GBLUP and SSRR-BLUP, and lower for the GBLUP method. Hence, SSRR-BLUP could be used when the number of genotyped animals is large, avoiding the construction of the inverse genomic relationship matrix.
Adding information on DMI from different lactations in the reference population gave higher accuracies than when only lactation 1 was included. Finally, no benefit was obtained by adding information on predictor traits to the reference population when DMI was already included. However, in the absence of DMI records, having records on FPCM and LW from different lactations is a good way to obtain EBV with a relatively good accuracy.
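The accuracy definition quoted in this abstract (correlation between EBV and pre-adjusted phenotypes, divided by the average accuracy of those phenotypes) can be written out as a short sketch. The function names, the toy values, and the 0.75 phenotype accuracy are hypothetical; how the study derived the average phenotype accuracy is not stated here, so it is simply passed in as a parameter.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ebv_accuracy(ebv, adjusted_phenotypes, mean_phenotype_accuracy):
    """Correlation between EBV and pre-adjusted phenotypes,
    scaled by the average accuracy of those phenotypes."""
    return pearson(ebv, adjusted_phenotypes) / mean_phenotype_accuracy

# Hypothetical validation animals (values are illustrative only).
ebv = [1, 2, 3, 4, 5]
adjusted_dmi = [4, 1, 3, 2, 5]
print(ebv_accuracy(ebv, adjusted_dmi, mean_phenotype_accuracy=0.75))
```

Dividing by the phenotype accuracy corrects for the fact that a phenotype is itself only an imperfect proxy for the true breeding value, so the raw correlation understates how well the EBV track it.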

18.
Various models have been used for genomic prediction. Bayesian variable selection models often predict more accurate genomic breeding values than genomic BLUP (GBLUP), but GBLUP is generally preferred for routine genomic evaluations because of low computational demand. The objective of this study was to achieve the benefits of both models using results from Bayesian models and genome-wide association studies as weights on single nucleotide polymorphism (SNP) markers when constructing the genomic matrix (G-matrix) for genomic prediction. The data comprised 5,221 progeny-tested bulls from the Nordic Holstein population. The animals were genotyped using the Illumina Bovine SNP50 BeadChip (Illumina Inc., San Diego, CA). Two sets of weighting factors were investigated: (1) the posterior SNP variance, the square of the posterior SNP effect, and the corresponding minus base-10 logarithm of the marker association P-value [−log10(P)] of a t-test, all obtained from the analysis using a Bayesian mixture model with 4 normal distributions; and (2) the square of the estimated SNP effect and the corresponding −log10(P) of a t-test, both obtained from the analysis using a classical genome-wide association study model (linear regression model). The weights were derived from the analysis based on data sets that were 0, 1, 3, or 5 yr before performing genomic prediction. In building a G-matrix, the weights were assigned either to each marker (single-marker weighting) or to each group of approximately 5 to 150 markers (group-marker weighting). The analysis was carried out for milk yield, fat yield, protein yield, fertility, and mastitis. Deregressed proofs (DRP) were used as response variables to predict genomic estimated breeding values (GEBV). Averaging over the 5 traits, the Bayesian model led to 2.0% higher reliability of GEBV than the GBLUP model with an original unweighted G-matrix.
The superiority of using a GBLUP with weighted G-matrix over GBLUP with an original unweighted G-matrix was the largest when using a weighting factor of posterior variance, resulting in 1.7 percentage points higher reliability. The second best weighting factors were −log10(P-value) of a t-test corresponding to the square of the posterior SNP effect from the Bayesian model and −log10(P-value) of a t-test corresponding to the square of the estimated SNP effect from the linear regression model, followed by the square of estimated SNP effect and the square of the posterior SNP effect. In addition, group-marker weighting performed better than single-marker weighting in terms of reducing bias of GEBV, and also slightly increased prediction reliability. The differences between weighting factors and scenarios were larger in prediction bias than in prediction accuracy. Finally, weights derived from a data set having a lag up to 3 yr did not reduce reliability of GEBV. The results indicate that posterior SNP variance estimated from a Bayesian mixture model is a good alternative weighting factor, and that using common weights on groups of 30 markers is a good strategy when using markers of the 50,000-marker (50K) chip. In a population with gradually increasing reference data, the weights can be updated once every 3 yr.
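To make the idea of single-marker and group-marker weighting concrete, here is a minimal sketch of a weighted G-matrix. It assumes a VanRaden-style construction, G = Z D Z' / Σ 2p(1−p), with weights rescaled to average 1; the abstract does not give the exact formula used in the study, so the scaling, the function names, and the toy data are assumptions.

```python
def group_weights(weights, group_size):
    """Group-marker weighting: give every marker in a block of consecutive
    markers the mean weight of that block (the last block may be shorter)."""
    out = []
    for start in range(0, len(weights), group_size):
        block = weights[start:start + group_size]
        mean_w = sum(block) / len(block)
        out.extend([mean_w] * len(block))
    return out

def weighted_g(genotypes, weights):
    """Weighted genomic relationship matrix (assumed form: Z D Z' / sum 2p(1-p)).

    genotypes: one row per animal, one column per SNP, coded 0/1/2.
    weights:   one non-negative weight per SNP, rescaled here to average 1.
    """
    n, m = len(genotypes), len(genotypes[0])
    p = [sum(row[j] for row in genotypes) / (2 * n) for j in range(m)]
    mean_w = sum(weights) / m
    d = [w / mean_w for w in weights]
    z = [[row[j] - 2 * p[j] for j in range(m)] for row in genotypes]
    denom = sum(2 * pj * (1 - pj) for pj in p)
    return [[sum(z[i][j] * d[j] * z[k][j] for j in range(m)) / denom
             for k in range(n)] for i in range(n)]

# Hypothetical toy data: 4 animals, 4 SNP, per-marker weights
# (e.g. posterior SNP variances from an earlier analysis).
genotypes = [[0, 1, 2, 1], [1, 1, 0, 2], [2, 0, 1, 0], [0, 2, 1, 1]]
marker_weights = [0.2, 1.4, 0.9, 0.1]

g_single = weighted_g(genotypes, marker_weights)                    # single-marker
g_group = weighted_g(genotypes, group_weights(marker_weights, 2))   # groups of 2
```

With all weights equal, this reduces to the ordinary unweighted G-matrix, which is a convenient sanity check when experimenting with different weighting factors.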

19.
This study investigated the efficiency of genomic prediction when adding markers identified by a genome-wide association study (GWAS), using a data set of high-density (HD) markers imputed from 54K markers in Chinese Holsteins. Among 3,056 Chinese Holsteins with imputed HD data, 2,401 individuals born before October 1, 2009, were used for GWAS and as a reference population for genomic prediction, and the 220 younger cows were used as a validation population. In total, 1,403, 1,536, and 1,383 significant single nucleotide polymorphisms (SNP; false discovery rate at 0.05) associated with conformation final score, mammary system, and feet and legs were identified, respectively. About 2 to 3% of the genetic variance of the 3 traits was explained by these significant SNP. Only a very small proportion of significant SNP identified by GWAS was included in the 54K marker panel. Three new marker sets (54K+) were herein produced by adding significant SNP obtained by linear mixed model for each trait into the 54K marker panel. Genomic breeding values were predicted using a Bayesian variable selection (BVS) model. The accuracies of genomic breeding value by BVS based on the 54K+ data were 2.0 to 5.2% higher than those based on the 54K data. The imputed HD markers yielded 1.4% higher accuracy on average (BVS) than the 54K data. Both the 54K+ and HD data generated lower bias of genomic prediction, and the 54K+ data yielded the lowest bias in all situations. Our results show that the imputed HD data were not very useful for improving the accuracy of genomic prediction and that adding the significant markers derived from the imputed HD marker panel could improve the accuracy of genomic prediction and decrease the bias of genomic prediction.
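Building a 54K+ style marker set amounts to selecting GWAS-significant SNP at a false discovery rate threshold and adding them to the base panel. The sketch below assumes the Benjamini-Hochberg procedure for FDR control, which the abstract does not specify; the marker IDs, P-values, and function names are hypothetical.

```python
def bh_significant(pvalues, fdr=0.05):
    """Indices of tests declared significant by the Benjamini-Hochberg
    procedure (assumed here; the abstract only states an FDR of 0.05)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        # Keep the largest rank whose P-value is under its BH threshold.
        if pvalues[i] <= rank * fdr / m:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])

def augment_panel(panel_ids, hd_ids, pvalues, fdr=0.05):
    """Union of the base panel and the FDR-significant HD markers."""
    significant = {hd_ids[i] for i in bh_significant(pvalues, fdr)}
    return sorted(set(panel_ids) | significant)

# Hypothetical example: a 2-marker base panel and 5 HD markers with GWAS P-values.
base_panel = ["s1", "s2"]
hd_markers = ["h1", "s2", "h3", "h4", "h5"]
gwas_p = [0.001, 0.045, 0.02, 0.5, 0.008]
print(augment_panel(base_panel, hd_markers, gwas_p))
```

Using a set union means significant SNP already present in the base panel are not duplicated, which matches the abstract's remark that only a small proportion of significant SNP were already on the 54K panel.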

20.
The objective of the present study was to assess the predictive ability of subsets of single nucleotide polymorphism (SNP) markers for development of low-cost, low-density genotyping assays in dairy cattle. Dense SNP genotypes of 4,703 Holstein bulls were provided by the USDA Agricultural Research Service. A subset of 3,305 bulls born from 1952 to 1998 was used to fit various models (training set), and a subset of 1,398 bulls born from 1999 to 2002 was used to evaluate their predictive ability (testing set). After editing, data included genotypes for 32,518 SNP and August 2003 and April 2008 predicted transmitting abilities (PTA) for lifetime net merit (LNM$), the latter resulting from progeny testing. The Bayesian least absolute shrinkage and selection operator method was used to regress August 2003 PTA on marker covariates in the training set to arrive at estimates of marker effects and direct genomic PTA. The coefficient of determination (R²) from regressing the April 2008 progeny test PTA of bulls in the testing set on their August 2003 direct genomic PTA was 0.375. Subsets of 300, 500, 750, 1,000, 1,250, 1,500, and 2,000 SNP were created by choosing equally spaced and highly ranked SNP, with the latter based on the absolute value of their estimated effects obtained from the training set. The SNP effects were re-estimated from the training set for each subset of SNP, and the 2008 progeny test PTA of bulls in the testing set were regressed on corresponding direct genomic PTA. The R² values for subsets of 300, 500, 750, 1,000, 1,250, 1,500, and 2,000 SNP with largest effects (evenly spaced SNP) were 0.184 (0.064), 0.236 (0.111), 0.269 (0.190), 0.289 (0.179), 0.307 (0.228), 0.313 (0.268), and 0.322 (0.291), respectively.
These results indicate that a low-density assay comprising selected SNP could be a cost-effective alternative for selection decisions and that significant gains in predictive ability may be achieved by increasing the number of SNP allocated to such an assay from 300 or fewer to 1,000 or more.
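The two subset strategies compared in this abstract (highest-ranked SNP by absolute estimated effect versus equally spaced SNP) and the R² used to score them can be sketched as follows. This is an illustration under assumptions: spacing is done by marker index rather than map position, and the function names and toy effects are hypothetical.

```python
from math import sqrt

def top_ranked(effects, k):
    """Indices of the k SNP with the largest absolute estimated effects."""
    order = sorted(range(len(effects)), key=lambda j: abs(effects[j]), reverse=True)
    return sorted(order[:k])

def evenly_spaced(n_snp, k):
    """Indices of k SNP spread evenly over n_snp markers
    (by index; a real assay would likely space by map position)."""
    return [(2 * i + 1) * n_snp // (2 * k) for i in range(k)]

def r_squared(y, x):
    """Coefficient of determination from a simple linear regression of y on x,
    which equals the squared Pearson correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy / sqrt(sxx * syy)) ** 2

# Hypothetical estimated SNP effects from a training set.
effects = [0.1, -0.9, 0.3, 0.05, -0.4, 0.7]
print(top_ranked(effects, 3))
print(evenly_spaced(len(effects), 3))
```

In the workflow described above, marker effects would then be re-estimated using only the chosen subset, and `r_squared` would be applied to the later progeny-test PTA and the resulting direct genomic PTA of the testing bulls.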
