Similar literature
1.
The enhanced availability of sequence data in livestock provides an opportunity for more accurate predictions in routine genomic evaluations. Such evaluations would therefore no longer rely only on the linkage disequilibrium between a chip marker and the causal mutation. The objective of this study was to assess the usefulness of sequence data in Saanen goats (n = 33) to better capture a quantitative trait locus (QTL) on chromosome 19 (CHI19) and improve the accuracy of predictions for 3 milk production traits, 5 type traits, and somatic cell scores. All 1,207 50K genotypes were imputed to the sequence level. Four scenarios, each using a subset of CHI19 imputed variants, were then tested. Sequence-derived information included all CHI19 variants (529,576), all variants in the QTL region (22,269), 178 variants selected in the QTL region and added to an updated chip, or 178 randomly selected variants on CHI19. Two genomic evaluation models were applied: single-step genomic BLUP and weighted single-step genomic BLUP. All scenarios were compared with single-step genomic BLUP using 50K genotypes. Best overall results were obtained using single-step genomic BLUP on 50K genotypes completed with all variants in the QTL region of chromosome 19 (6.2% average increase in accuracy for 9 traits) with the highest accuracy gain for fat yield (17.9%), significant increases for milk (13.7%) and protein yields (12.5%), and type traits associated with CHI19. Despite its association with the QTL region of chromosome 19, the somatic cell score showed decreased accuracy in every alternative scenario. Using all CHI19 variants led to an overall decrease of 4.8% in prediction accuracy. The updated chip was efficient and improved genomic evaluations by 3.1 to 6.4% on average, depending on the scenario. Indeed, information from only a few carefully selected variants increased accuracies for traits of interest when used in a single-step genomic BLUP model. 
In conclusion, using QTL region variants imputed from sequence data in single-step genomic evaluations represents a promising perspective for such evaluations in dairy goats. Furthermore, using only a limited number of selected variants in QTL regions, as available on SNP chip updates, significantly increases the accuracy for QTL-associated traits without deteriorating the evaluation accuracy for other traits. The latter approach is interesting, as it avoids time-consuming imputation and data formatting processes and provides reliable genotypes.

2.
This study investigated the efficiency of genomic prediction when markers identified by a genome-wide association study (GWAS) were added to the marker panel, using a data set of high-density (HD) markers imputed from 54K markers in Chinese Holsteins. Among 3,056 Chinese Holsteins with imputed HD data, 2,401 individuals born before October 1, 2009, were used for GWAS and as a reference population for genomic prediction, and the 220 younger cows were used as a validation population. In total, 1,403, 1,536, and 1,383 significant single nucleotide polymorphisms (SNP; false discovery rate at 0.05) associated with conformation final score, mammary system, and feet and legs were identified, respectively. About 2 to 3% of the genetic variance of the 3 traits was explained by these significant SNP. Only a very small proportion of the significant SNP identified by GWAS was included in the 54K marker panel. Three new marker sets (54K+) were therefore produced by adding the significant SNP obtained by a linear mixed model for each trait to the 54K marker panel. Genomic breeding values were predicted using a Bayesian variable selection (BVS) model. The accuracies of genomic breeding values by BVS based on the 54K+ data were 2.0 to 5.2% higher than those based on the 54K data. The imputed HD markers yielded 1.4% higher accuracy on average (BVS) than the 54K data. Both the 54K+ and HD data generated lower bias of genomic prediction, and the 54K+ data yielded the lowest bias in all situations. Our results show that the imputed HD data were not very useful for improving the accuracy of genomic prediction and that adding the significant markers derived from the imputed HD marker panel could improve the accuracy and decrease the bias of genomic prediction.

3.
Journal of dairy science, 2023, 106(5):3345-3358
Genetic evaluations of local cattle breeds are hampered due to small reference groups or biased due to the utilization of SNP effects estimated in other large populations. Against this background, there is a lack of studies addressing the possible advantage of whole-genome sequences (WGS) or consideration of specific variants from WGS data in genomic predictions for local breeds with small population size. Consequently, the aim of this study was to compare genetic parameters and accuracies of genomic estimated breeding values (GEBV) for 305-d production traits, fat-to-protein ratio (FPR), and somatic cell score (SCS) at the first test date after calving and conformation traits of the endangered German Black Pied cattle (DSN) breed using 4 different marker panels: (1) the commercial 50K Illumina BovineSNP50 BeadChip, (2) a customized 200K chip designed for DSN (DSN200K) which considers the most important variants for DSN from WGS, (3) randomly generated 200K chips based on WGS data, and (4) a WGS panel. The same number of animals was considered for all marker panel analyses (i.e., 1,811 genotyped or sequenced cows for conformation traits, 2,383 cows for lactation production traits, and 2,420 cows for FPR and SCS). Mixed models for the estimation of genetic parameters directly included the respective genomic relationship matrix from the different marker panels plus the trait-specific fixed effects. For the calculation of GEBV accuracies, we applied repeated random subsampling validation. In the process of separate cross-validations per trait, we created a validation set including 20% of cows with masked phenotypes, and a training set comprising 80% of the cows. The cows were selected randomly in a procedure with 10 replicates considering replacements in the different scenarios. The accuracy was defined as the correlation between the direct GEBV and the phenotypes with subtracted corresponding fixed effects for the cows in the validation set. 
For FPR and SCS, as well as for lactation production traits, heritabilities were largest based on WGS data, but the increase compared with the 50K or DSN200K applications was quite small in the range from 0.01 to 0.03. Also, for most of the conformation traits, heritabilities were largest based on WGS and DSN200K data, but the increase was in the range of the corresponding standard error. Accordingly, GEBV accuracies for most of the studied traits were highest based on WGS data or when utilizing the DSN200K chip, but the accuracy differences across the marker panels were quite small and nonsignificant. In conclusion, WGS data and the DSN200K chip only contributed to minor improvements in genomic predictions, still justifying the use of the commercial 50K chip. Nevertheless, WGS and the DSN200K chip harbor breed-specific variants, which are valuable for studying causal genetic mechanisms in the endangered DSN population.
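
The repeated random subsampling validation described in this abstract (20% of cows masked, 80% used for training, 10 replicates) can be illustrated with a minimal sketch. The function name `subsampling_splits`, its parameters, and the fixed seed are hypothetical and not taken from the study; the sketch assumes that validation cows are drawn afresh in each replicate, so the same cow may be masked in several replicates.

```python
import random


def subsampling_splits(cow_ids, n_replicates=10, validation_fraction=0.2, seed=1):
    """Return a list of (training_ids, validation_ids) pairs.

    Assumption: each replicate draws its validation set independently,
    so validation sets of different replicates may overlap.
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    ids = list(cow_ids)
    n_val = round(len(ids) * validation_fraction)
    splits = []
    for _ in range(n_replicates):
        # Cows whose phenotypes would be masked in this replicate.
        validation = set(rng.sample(ids, n_val))
        # All remaining cows keep their phenotypes and form the training set.
        training = [cow for cow in ids if cow not in validation]
        splits.append((training, sorted(validation)))
    return splits
```

In each replicate, a model would be trained on `training_ids` and the accuracy computed on `validation_ids`; averaging over replicates gives the reported value.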

4.
Journal of dairy science, 2022, 105(6):5206-5220
As part of the From'MIR project, traits related to the composition and cheese-making properties (CMP) of milk were predicted from 6.6 million mid-infrared spectra taken from 410,622 Montbéliarde cows (19,862 with genotypes). Genome-wide association studies of imputed whole-genome sequences highlighted candidate SNPs that were then added to the EuroG10K BeadChip, which is routinely used in genomic selection. In the present study, we (1) assessed the reliability of single-step genomic BLUP breeding values (ssEBVs) for cheese yields, coagulation traits, and casein and calcium content generated from test-day records of the first 3 lactations, (2) estimated realized genetic trends for these traits over the last decade, and (3) simulated different cheese-making breeding objectives and estimated the responses for CMP as well as for other traits currently selected in the Montbéliarde breed. To estimate the reliability of ssEBVs, the available data were split into 2 independent training and validation sets that respectively contained cows with the oldest and the most recent lactation data. The training set included 155,961 cows (12,850 with genotypes) and was used to predict ssEBVs of 2,125 genotyped cows in the validation set. We first tested 4 models that included either lactation (LACT) or test-day (TD) records from the first (1) or the first 3 (3) lactations, giving equal weight to all 50K SNP effects. Mean reliabilities were 61%, 62%, 63%, and 64% for the LACT1, LACT3, TD1, and TD3 models, respectively. Using the most accurate model (TD3), we then compared the reliabilities of 3 scenarios with: SNPs from the Illumina BovineSNP50 BeadChip only, equally weighted (50K); 50K SNPs plus additional candidate SNPs, equally weighted (50K+); and 50K and candidate SNPs with additional weight given to 7 to 14 candidate SNPs, depending on the trait (CAND). 
The 50K+ and CAND scenarios led to similar mean reliabilities (67%) and both outperformed the 50K scenario (64%), whereas the CAND scenario generated the least biased ssEBVs. To assess genetic trends, SNP effects were estimated with a single-step GBLUP based on the TD3 model and the 50K scenario applied to the whole population (2.6 million performance records from 190,261 cows and 423,348 animals in the pedigree, of which 21,874 were genotyped) and then applied to 50K genotypes of 21,171 males and 311,761 females. We detected a positive genetic trend for all CMP during the last decade, probably due to selection for an increase in milk protein and fat content in Montbéliarde cows. Finally, we compared the selection responses to 3 different breeding objectives: the current Montbéliarde total merit index (TMI) and 2 alternative scenarios that gave a weight of 70% to TMI and the remaining 30% to either milk casein content (TMI-COMP) or a combination of 3 CMP (TMI-Cheese). The TMI-Cheese scenario yielded the best responses for all the CMP analyzed, whereas values in the TMI-COMP scenario were intermediate, with a slight effect on other traits currently included in TMI. Based on these results, a program of genomic evaluation for CMP predicted from mid-infrared spectra was designed and implemented for the Montbéliarde breed.

5.
Genotype imputation, often focused on SNP and small insertions and deletions (indels; size ≤50 bp), is a crucial step for association mapping and estimation of genomic breeding values. Here, we present strategies to impute genotypes for large chromosomal deletions (size >50 bp), along with SNP and indels in cattle. The pipelines include a strategy for extending the whole-genome sequence reference panel for large deletions, a 2-step genotype refinement approach using Beagle4 and SHAPEIT2 software, and finally, joint imputation of SNP, indels, and large deletions to the existing SNP array-typed population using Minimac3 software. Using these pipelines we achieved an imputation accuracy, measured as the squared Pearson correlation (r²), of > 0.6 at minor allele frequencies as low as 0.7% for SNP and indels, and 0.2% for large deletions. This highlights the potential of our approach to build a haplotype reference panel and impute different classes of sequence variants across a wide allele frequency spectrum with high accuracy.

6.
Identification of the genetic variants associated with calf survival in dairy cattle will aid in the elimination of harmful mutations from the cattle population and the reduction of calf and young stock mortality rates. We used de-regressed estimated breeding values for the young stock survival (YSS) index as response variables in a genome-wide association study with imputed whole-genome sequence variants. A total of 4,610 bulls with estimated breeding values were genotyped with the Illumina BovineSNP50 (Illumina, San Diego, CA) single nucleotide polymorphism (SNP) genotyping array. Genotypes were imputed to whole-genome sequence variants. After quality control, 15,419,550 SNP on 29 Bos taurus autosomes (BTA) were used for association analysis. A modified mixed-model association analysis was used for a genome scan, followed by a linear mixed-model analysis for selected genetic variants. We identified 498 SNP on BTA5 and BTA18 that were associated with the YSS index in Nordic Holstein. The SNP rs440345507 (Chr5:94721790) on BTA5 was the putative causal mutation affecting YSS. Two haplotype-based models were used to identify haplotypes with the largest detrimental effects on YSS index. For each association signal, 1 haplotype region with harmful effects and the lead associated SNP were identified. Detected haplotypes on BTA5 and BTA18 explained 1.16 and 1.20%, respectively, of genetic variance for the YSS index. We examined whether YSS quantitative trait loci (QTL) on BTA5 and BTA18 were associated with stillbirth. YSS QTL on BTA18 overlapped a QTL region for stillbirth, but most likely 2 different causal variants were responsible for these 2 QTL. Four component traits of the YSS index, defined by sex and age, were analyzed separately by the modified mixed-model approach. The same genomic regions were associated with both bull and heifer calf mortality. 
Several genes (EPS8, LOC100138951, and KLK family genes) contained a lead associated SNP or were included in haplotypes with large detrimental effects on YSS in Nordic Holstein cattle.

7.
With the availability of single nucleotide polymorphism (SNP) marker chips, such as the Illumina BovineSNP50 BeadChip (50K), genomic evaluation has been routinely implemented in dairy cattle breeding. However, for an average dairy producer, total costs associated with the 50K chip are still too high to have all the cows genotyped and genomically evaluated. To study the accuracy of cheaper low-density chips, genotypes were simulated for 2 low-density chips, the Illumina Bovine3K BeadChip (3K) and BovineLD BeadChip (6K), according to their original marker maps. Simulated missing genotypes of the 50K chip were imputed using the programs Beagle and Findhap. Three genotype data sets were used to study imputation accuracy: the EuroGenomics data set, with 14,405 reference bulls (data set I); the smaller EuroGenomics data set, with 11,670 older reference bulls (data set II); and the data set of all genotyped German Holsteins, with 31,597 reference animals (data set III). Imputed genotypes were compared with their original ones to calculate allele error rate for validation animals in the 3 data sets. To evaluate the loss in accuracy of genomic prediction when using imputed genotypes, a genomic evaluation was conducted only for EuroGenomics data set II. Furthermore, combined genome-enhanced breeding values calculated from the original and imputed genotypes were compared. Allele error rate for EuroGenomics data set II was highest for the Findhap program on the 3K chip (3.3%) and lowest for the Beagle program on the 6K chip (0.6%). Across the data sets, Beagle was shown to be about 2 times as accurate as Findhap. Compared with the real 50K genotypes, the reduction in reliability of the genomic prediction when using the imputed genotypes was highest for Findhap on the 3K chip (5.3%) and lowest for Beagle on the 6K chip (1%) when averaged over the 12 evaluated traits. 
Differences in genome-enhanced breeding values of the original and imputed genotypes were largest for Findhap on the 3K chip, whereas Beagle on the 6K chip had the smallest difference. The low-density chip, 6K, gave markedly higher imputation accuracy and more accurate genomic prediction than the 3K chip. On the basis of the relatively small reduction in accuracy of genomic prediction, we would recommend the BovineLD 6K chip for large-scale genotyping as long as its costs are acceptable to breeders.
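
The allele error rate used in this abstract to compare imputed with original genotypes can be sketched as follows. The abstract does not give the formula, so this assumes one common definition: genotypes are coded as counts of one allele (0, 1, or 2), and the rate is the number of wrongly imputed alleles divided by the total number of alleles compared. The function name `allele_error_rate` is hypothetical.

```python
def allele_error_rate(true_genotypes, imputed_genotypes):
    """Share of alleles that differ between true and imputed genotypes.

    Assumption: genotypes are allele counts (0, 1, 2), so a 0 imputed as 2
    counts as two wrong alleles and a 1 imputed as 2 counts as one.
    """
    if len(true_genotypes) != len(imputed_genotypes):
        raise ValueError("genotype vectors must have the same length")
    if not true_genotypes:
        raise ValueError("no genotypes to compare")
    wrong_alleles = sum(abs(t - i) for t, i in zip(true_genotypes, imputed_genotypes))
    # Every diploid genotype carries two alleles.
    return wrong_alleles / (2 * len(true_genotypes))
```

Under this definition, an allele error rate of 0.6% means that 6 of every 1,000 imputed alleles differ from the originally genotyped ones.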

8.
Journal of dairy science, 2022, 105(3):2426-2438
This study investigated the reliability of genomic prediction (GP) using breed origin of alleles (BOA) approach in the Nordic Red (RDC) population, which has an admixed population structure. The RDC population consists of animals with varying degrees of genetic materials from the Danish Red (RDM), Swedish Red (SRB), Finnish Ayrshire (FAY), and Holstein (HOL) because bulls have been used across the breeds. The BOA approach was tested using 39,550 RDC animals in the reference population and 11,786 in the validation population. Deregressed proofs (DRP) of milk, fat and protein were used as response variable for GP. Direct genomic breeding values (DGV) for animals in the validation population were calculated with (BOA model) or without (joint model) considering breed origin of alleles. The joint model assumed homogeneous marker effects and a single set of marker effects were estimated, whereas BOA model assumed heterogeneous marker effects, and different sets of marker effects were estimated across the breeds. For the BOA approach, we tested scenarios assuming both correlated (BOA_cor) and uncorrelated (BOA_uncor) marker effects between the breeds. Additionally, we investigated GP using a standard Illumina 50K chip and including SNP selected from imputed whole-genome sequencing (50K+WGS). We also studied the effect of estimating (co)variances for genome regions of different sizes to exploit the information of the genome regions contributing to the (co)variance between the breeds. Region sizes were set as 1 SNP, a group of 30 or 100 adjacent SNP, or the whole genome. Reliability of DGV was measured as squared correlations between DGV and DRP divided by the reliability of DRP. Across the 3 traits, in general, RS30 and RS100 SNP yielded the highest reliabilities. Including WGS SNP improved reliabilities in almost all scenarios (0.297 on average for 50K and 0.307 on average for 50K+WGS). 
The BOA_uncor (0.233 on average) was inferior to the joint model (0.339 on average), but the reliabilities obtained using BOA_cor (0.334 on average) in most cases were not significantly different from those obtained using the joint model. The results indicate that both including additional whole-genome sequencing SNP and dividing the genome into fixed regions improve GP in the RDC. The BOA models have the potential to increase the reliability of GP, but the benefit is limited in populations with a high exchange of genetic material for a long time, as is the case for RDC.
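
The reliability measure stated in this abstract (squared correlation between DGV and DRP divided by the reliability of DRP) can be written out as a short sketch. The function names are hypothetical, and the sketch assumes that a single average DRP reliability for the validation animals is supplied as a number; the abstract does not say how that value was obtained.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def dgv_reliability(dgv, drp, drp_reliability):
    """Squared correlation between DGV and DRP, scaled by DRP reliability.

    Assumption: drp_reliability is the mean reliability of the DRP of the
    validation animals, a value strictly between 0 and 1.
    """
    r = pearson(dgv, drp)
    return r * r / drp_reliability
```

Dividing by the DRP reliability corrects for the fact that DRP are themselves noisy measures of breeding value, so the raw squared correlation understates how well DGV predict true merit.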

9.
Journal of dairy science, 2017, 100(7):5479-5490
Genomic selection may accelerate genetic progress in breeding programs of indicine breeds when compared with traditional selection methods. We present results of genomic predictions in Gyr (Bos indicus) dairy cattle of Brazil for milk yield (MY), fat yield (FY), protein yield (PY), and age at first calving using information from bulls and cows. Four different single nucleotide polymorphism (SNP) chips were studied. Additionally, the effect of the use of imputed data on genomic prediction accuracy was studied. A total of 474 bulls and 1,688 cows were genotyped with the Illumina BovineHD (HD; San Diego, CA) and BovineSNP50 (50K) chip, respectively. Genotypes of cows were imputed to HD using FImpute v2.2. After quality check of data, 496,606 markers remained. The HD markers present on the GeneSeek SGGP-20Ki (15,727; Lincoln, NE), 50K (22,152), and GeneSeek GGP-75Ki (65,018) were subset and used to assess the effect of lower SNP density on accuracy of prediction. Deregressed breeding values were used as pseudophenotypes for model training. Data were split into reference and validation to mimic a forward prediction scheme. The reference population consisted of animals whose birth year was ≤2004 and consisted of either only bulls (TR1) or a combination of bulls and dams (TR2), whereas the validation set consisted of younger bulls (born after 2004). Genomic BLUP was used to estimate genomic breeding values (GEBV) and reliability of GEBV (R2PEV) was based on the prediction error variance approach. Reliability of GEBV ranged from ∼0.46 (FY and PY) to 0.56 (MY) with TR1 and from 0.51 (PY) to 0.65 (MY) with TR2. When averaged across all traits, R2PEV were substantially higher (R2PEV of TR1 = 0.50 and TR2 = 0.57) compared with reliabilities of parent averages (0.35) computed from pedigree data and based on diagonals of the coefficient matrix (prediction error variance approach). 
Reliability was similar for all the 4 marker panels using either TR1 or TR2, except that the imputed HD cow data set led to an inflation of reliability. Reliability of GEBV could be increased by enlarging the limited bull reference population with cow information. A reduced panel of ∼15K markers resulted in reliabilities similar to using HD markers.

10.
The construction and use of haploblocks [adjacent single nucleotide polymorphisms (SNP) in strong linkage disequilibrium] for genomic evaluation is advantageous, because the number of effects to be estimated can be reduced without discarding relevant genomic information. Furthermore, haplotypes (the combination of 2 or more SNP) can increase the probability of capturing the quantitative trait loci effect compared with individual SNP markers. With regards to haplotypes, the allele frequency parameter is also of interest, because as a selection criterion, it allows the number of rare alleles to be reduced, and the effects of those alleles are usually difficult to estimate. We have proposed a simple pipeline that simultaneously incorporates linkage disequilibrium and allele frequency information in genomic evaluation, and here we present the first results obtained with this procedure. We used a population of 2,235 progeny-tested bulls from the Montbéliarde breed for the tests. Phenotype data were available in the form of daughter yield deviations on 5 production traits, and genotype data were available from the 50K SNP chip. We conducted a classical validation study by splitting the population into training (80% oldest animals) and validation (20% youngest animals) sets to emulate a real-life scenario in which the selection candidates had no available phenotype data. We measured all reported parameters for the validation set. Our results proved that the proposed method was indeed advantageous, and that the accuracy of genomic evaluation could be improved. Compared with results from a genomic BLUP analysis, correlations between daughter yield deviations (a proxy for true) and genomic estimated breeding values increased by an average of 2.7 percentage points for the 5 traits. Inflation of the genomic evaluation of the selection candidates was also significantly reduced. 
The proposed method outperformed the other SNP and haplotype-based tests we had evaluated in a previous study. The combination of linkage disequilibrium–based haploblocks and allele frequency–based haplotype selection methods is a promising way to improve the efficiency of genomic evaluation. Further work is needed to optimize each step in the proposed analysis pipeline.
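
The validation design in this abstract (oldest 80% of animals for training, youngest 20% for validation) amounts to a split by birth date. A minimal sketch, assuming each animal is given as an `(id, birth_date)` pair; the function name `forward_split` and the data layout are hypothetical.

```python
def forward_split(animals, training_fraction=0.8):
    """Split (id, birth_date) pairs into oldest (training) and youngest (validation).

    Assumption: birth_date values are comparable (years, ISO date strings,
    or date objects), and ties are kept in their input order.
    """
    ordered = sorted(animals, key=lambda animal: animal[1])
    n_train = round(len(ordered) * training_fraction)
    training_ids = [animal_id for animal_id, _ in ordered[:n_train]]
    validation_ids = [animal_id for animal_id, _ in ordered[n_train:]]
    return training_ids, validation_ids
```

Splitting by age rather than at random mimics practice, where breeding values of young candidates are predicted from older animals that already have phenotypes.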

11.
Journal of dairy science, 2019, 102(8):7237-7247
Relatedness between reference and test animals has an important effect on the reliability of genomic prediction for test animals. Because genomic prediction has been widely applied in practical cattle breeding and bulls have been selected according to genomic breeding value without progeny testing, the sires or grandsires of candidates might not have phenotypic information and might not be in the reference population when the candidates are selected. The objective of this study was to investigate the decreasing trend of the reliability of genomic prediction given distant reference populations, using genomic best linear unbiased prediction (GBLUP) and Bayesian variable selection models with or without including the quantitative trait locus (QTL) markers detected from sequencing data. The data used in this study consisted of 22,242 bulls genotyped using the 54K SNP array from EuroGenomics. Among them, 1,444 Danish bulls born from 2006 to 2010 were selected as test animals. Different reference populations with varying relationships to test animals were created according to pedigree-based relationships. The reference individuals having a relationship with one or more test animals higher than 0.4 (scenario ρ < 0.4), 0.2 (ρ < 0.2), or 0.1 (ρ < 0.1, where ρ = relationship coefficient) were removed from reference sets; these represented the distance between reference and test animals being 2 generations, 3 generations, and 4 generations, respectively. Imputed whole-genome sequencing data of bulls from Denmark were used to conduct a genome-wide association study (GWAS). A small number of significant variants (QTL markers) from the GWAS were added to the array data. To compare the effects of different models, the basic GBLUP model, a Bayesian selection variable model, a GBLUP model with 2 components of genetic effects, and a Bayesian model with pooled array data and QTL markers were used for estimating genomic estimated breeding values (GEBV) of test animals. 
The reliability of genomic prediction decreased when the test animals were more generations away from the reference population. The reliability of genomic prediction was 0.461 for 1 generation away and 0.396 for 3 generations away, with the same number of individuals in the reference set, using a GBLUP model with chip markers only. The results showed that using the Bayesian method and QTL markers improved the reliability of genomic prediction in all scenarios of relationship between test and reference animals, by between 1.3% and 65.1% (4 generations away with only 841 individuals in the reference set). However, most gains were for predictions of milk yield and fat yield. There was little improvement for predictions of protein yield and mastitis, and no improvement for prediction of fertility, except for scenario ρ < 0.1, in which there was a large improvement for predictions of all traits. On the other hand, models including more than 10% polygenic effect decreased prediction reliability when the relationship between test and reference animals was distant.
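
The construction of the distant reference sets in this abstract (removing every reference animal whose pedigree relationship with at least one test animal exceeds 0.4, 0.2, or 0.1) can be sketched as a simple filter. The function name and the dictionary layout for the relationship coefficients are hypothetical; pairs missing from the dictionary are assumed to be unrelated.

```python
def distant_reference(reference_ids, test_ids, relationship, threshold):
    """Keep reference animals whose relationship to every test animal is <= threshold.

    Assumption: relationship maps (reference_id, test_id) pairs to pedigree
    relationship coefficients; absent pairs are treated as 0.0.
    """
    return [
        ref
        for ref in reference_ids
        if all(relationship.get((ref, test), 0.0) <= threshold for test in test_ids)
    ]
```

For example, with `threshold=0.4` an animal related at 0.5 to any test bull is dropped, which corresponds to the scenario labelled ρ < 0.4 in the abstract.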

12.
The purpose of this study was to determine whether multi-country genomic evaluation can be accomplished by multiple-trait genomic best linear unbiased predictor (GBLUP) without sharing genotypes of important animals. Phenotypes and genotypes with 40k SNP were simulated for 25,000 animals, each with 4 traits assuming the same genetic variance and 0.8 genetic correlations. The population was split into 4 subpopulations corresponding to 4 countries, one for each trait. Additionally, a prediction population was created from genotyped animals that were not present in the individual countries but were related to each country's population. Genomic estimated breeding values were computed for each country and subsequently converted to SNP effects. Phenotypes were reconstructed for the prediction population based on the SNP effects of a country and the prediction animals' genotypes. The prediction population was used as the basis for the international evaluation, enabling bull comparisons without sharing genotypes and only sharing SNP effects. The computations were such that SNP effects computed within-country or in the prediction population were the same. Genomic estimated breeding values were calculated by single-trait GBLUP for within-country and multiple-trait GBLUP for multi-country predictions. The true accuracy for the prediction population with reconstructed phenotypes was at most 0.02 less than the accuracy with the original data. The differences increased when countries were assumed unequally sized. However, accuracies by multiple-trait GBLUP with the prediction population were always greater than accuracies from any single within-country prediction. Multi-country genomic evaluations by multiple-trait GBLUP are possible without using original genotypes at a cost of lower accuracy compared with explicitly combining countries' data.

13.
This study investigated the reliability of genomic estimated breeding values (GEBV) in the Danish Holstein population. The data in the analysis included 3,330 bulls with both published conventional EBV and single nucleotide polymorphism (SNP) markers. After data editing, 38,134 SNP markers were available. In the analysis, all SNP were fitted simultaneously as random effects in a Bayesian variable selection model, which allows heterogeneous variances for different SNP markers. The response variables were the official EBV. Direct GEBV were calculated as the sum of individual SNP effects. Initial analyses of 4 index traits were carried out to compare models with different intensities of shrinkage for SNP effects; that is, mixture prior distributions of scaling factors (standard deviation of SNP effects) assuming 5, 10, 20, or 50% of SNP having large effects and the others having very small or no effects, and a single prior distribution common for all SNP. It was found that, in general, the model with a common prior distribution of scaling factors had better predictive ability than any mixture prior models. Therefore, a common prior model was used to estimate SNP effects and breeding values for all 18 index traits. Reliability of GEBV was assessed by squared correlation between GEBV and conventional EBV (r²(GEBV, EBV)), and expected reliability was obtained from prediction error variance using a 5-fold cross validation. Squared correlations between GEBV and published EBV (without any adjustment) ranged from 0.252 to 0.700, with an average of 0.418. Expected reliabilities ranged from 0.494 to 0.733, with an average of 0.546. Averaged over 18 traits, r²(GEBV, EBV) was 0.13 higher and expected reliability was 0.26 higher than reliability of conventional parent average. The results indicate that genomic selection can greatly improve the accuracy of preselection for young bulls compared with traditional selection based on parent average information.

14.
Cost-effective high-density (HD) genotypes of livestock species can be obtained by genotyping a proportion of the population using a HD panel and the remainder using a cheaper low-density panel, and then imputing the missing genotypes that are not directly assayed in the low-density panel. The efficacy of genotype imputation can largely be affected by the structure and history of the specific target population and it should be checked before incorporating imputation in routine genotyping practices. Here, we investigated the efficacy of imputation in crossbred dairy cattle populations of East Africa using 4 different commercial single nucleotide polymorphisms (SNP) panels, 3 reference populations, and 3 imputation algorithms. We found that Minimac and a reference population, which included a mixture of crossbred and ancestral purebred animals, provided the highest imputation accuracy compared with other scenarios of imputation. The accuracies of imputation, measured as the correlation between real and imputed genotypes averaged across SNP, were around 0.76 and 0.94 for 7K and 40K SNP, respectively, when imputed up to a 770K panel. We also presented a method to maximize the imputation accuracy of low-density panels, which relies on the pairwise (co)variances between SNP and the minor allele frequency of SNP. The performance of the developed method was tested in a 5-fold cross-validation process where various densities of SNP were selected using the (co)variance method and also by alternative SNP selection methods and then imputed up to the HD panel. The (co)variance method provided the highest imputation accuracies at almost all marker densities, with accuracies being up to 0.19 higher than the random selection of SNP. The accuracies of imputation from 7K and 40K panels selected using the (co)variance method were around 0.80 and 0.94, respectively. The presented method also achieved higher accuracy of genomic prediction at lower densities of selected SNP. 
The squared correlation between genomic breeding values estimated using imputed genotypes and those from the real 770K HD panel was 0.95 when the accuracy of imputation was 0.64. The presented method for SNP selection is straightforward in its application and can ensure high accuracies in genotype imputation of crossbred dairy populations in East Africa.
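
The accuracy measure described in this abstract (the correlation between real and imputed genotypes, averaged across SNP) can be illustrated with a minimal sketch. The function names, the 0/1/2 genotype coding, and the choice to skip SNP with no variance are assumptions made for illustration; the abstract does not say how the authors handled such markers.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists; None if either has no variance."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sqrt(sxx * syy)


def imputation_accuracy(true_geno, imputed_geno):
    """Mean per-SNP correlation between true and imputed genotypes.

    Rows are animals, columns are SNP, genotypes are allele counts (0, 1, 2).
    SNP whose true or imputed genotypes are constant are skipped (an assumption).
    """
    n_snp = len(true_geno[0])
    correlations = []
    for j in range(n_snp):
        true_col = [row[j] for row in true_geno]
        imputed_col = [row[j] for row in imputed_geno]
        r = pearson(true_col, imputed_col)
        if r is not None:
            correlations.append(r)
    if not correlations:
        raise ValueError("no SNP with variance in both true and imputed genotypes")
    return sum(correlations) / len(correlations)


# Hypothetical toy data: 4 animals, 3 SNP, one imputation error in the last animal.
true_genotypes = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [0, 2, 2]]
imputed_genotypes = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [1, 2, 2]]
print(imputation_accuracy(true_genotypes, imputed_genotypes))
```

Averaging per-SNP correlations, rather than counting the proportion of correctly imputed genotypes, keeps rare alleles from being swamped by the common homozygote.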

15.
Various models have been used for genomic prediction. Bayesian variable selection models often predict more accurate genomic breeding values than genomic BLUP (GBLUP), but GBLUP is generally preferred for routine genomic evaluations because of low computational demand. The objective of this study was to achieve the benefits of both models using results from Bayesian models and genome-wide association studies as weights on single nucleotide polymorphism (SNP) markers when constructing the genomic matrix (G-matrix) for genomic prediction. The data comprised 5,221 progeny-tested bulls from the Nordic Holstein population. The animals were genotyped using the Illumina Bovine SNP50 BeadChip (Illumina Inc., San Diego, CA). Weighting factors in this investigation were the posterior SNP variance, the square of the posterior SNP effect, and the corresponding minus base-10 logarithm of the marker association P-value [−log10(P)] of a t-test obtained from the analysis using a Bayesian mixture model with 4 normal distributions, the square of the estimated SNP effect, and the corresponding −log10(P) of a t-test obtained from the analysis using a classical genome-wide association study model (linear regression model). The weights were derived from the analysis based on data sets that were 0, 1, 3, or 5 yr before performing genomic prediction. In building a G-matrix, the weights were assigned either to each marker (single-marker weighting) or to each group of approximately 5 to 150 markers (group-marker weighting). The analysis was carried out for milk yield, fat yield, protein yield, fertility, and mastitis. Deregressed proofs (DRP) were used as response variables to predict genomic estimated breeding values (GEBV). Averaging over the 5 traits, the Bayesian model led to 2.0% higher reliability of GEBV than the GBLUP model with an original unweighted G-matrix. 
The superiority of using a GBLUP with weighted G-matrix over GBLUP with an original unweighted G-matrix was the largest when using a weighting factor of posterior variance, resulting in 1.7 percentage points higher reliability. The second best weighting factors were −log10 (P-value) of a t-test corresponding to the square of the posterior SNP effect from the Bayesian model and −log10 (P-value) of a t-test corresponding to the square of the estimated SNP effect from the linear regression model, followed by the square of estimated SNP effect and the square of the posterior SNP effect. In addition, group-marker weighting performed better than single-marker weighting in terms of reducing bias of GEBV, and also slightly increased prediction reliability. The differences between weighting factors and scenarios were larger in prediction bias than in prediction accuracy. Finally, weights derived from a data set having a lag up to 3 yr did not reduce reliability of GEBV. The results indicate that posterior SNP variance estimated from a Bayesian mixture model is a good alternative weighting factor, and that applying common weights to groups of 30 markers is a good strategy when using markers of the 50,000-marker (50K) chip. In a population with gradually increasing reference data, the weights can be updated once every 3 yr.
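
The idea of weighting markers when building the G-matrix, either one weight per marker or one common weight per group of markers, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the formula, so the sketch uses a commonly cited form, G = Z D Z' / (2 Σ p(1 − p)), with Z the genotype matrix centered by twice the allele frequency and D a diagonal matrix of weights rescaled to mean 1. The function names and the use of a simple block mean for group weights are likewise hypothetical.

```python
def group_weights(weights, group_size):
    """Give every marker the mean weight of its block of consecutive markers."""
    grouped = []
    for start in range(0, len(weights), group_size):
        block = weights[start:start + group_size]
        block_mean = sum(block) / len(block)
        grouped.extend([block_mean] * len(block))
    return grouped


def weighted_g_matrix(genotypes, weights):
    """Weighted genomic relationship matrix (assumed form, see lead-in).

    genotypes: rows are animals, columns are SNP, coded as allele counts 0/1/2.
    weights: one non-negative weight per SNP, rescaled here to have mean 1.
    """
    n_animals = len(genotypes)
    n_snp = len(genotypes[0])
    # Allele frequencies estimated from the genotyped animals themselves.
    freqs = [sum(row[j] for row in genotypes) / (2 * n_animals) for j in range(n_snp)]
    denom = 2 * sum(p * (1 - p) for p in freqs)
    mean_weight = sum(weights) / n_snp
    if denom == 0 or mean_weight == 0:
        raise ValueError("need at least one polymorphic SNP and one non-zero weight")
    d = [w / mean_weight for w in weights]
    z = [[row[j] - 2 * freqs[j] for j in range(n_snp)] for row in genotypes]
    return [
        [sum(z[a][j] * d[j] * z[b][j] for j in range(n_snp)) / denom
         for b in range(n_animals)]
        for a in range(n_animals)
    ]


# Hypothetical toy data: 3 animals, 4 SNP, weights such as posterior SNP variances.
genotypes = [[0, 1, 2, 1], [1, 1, 0, 2], [2, 0, 1, 0]]
snp_weights = [0.5, 1.5, 3.0, 1.0]
g_single = weighted_g_matrix(genotypes, snp_weights)
g_group = weighted_g_matrix(genotypes, group_weights(snp_weights, 2))
```

With all weights equal, the weighted matrix reduces to the unweighted one, which makes the unweighted GBLUP a special case of the same code path.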

16.
Achieving accurate genomic estimated breeding values for dairy cattle requires a very large reference population of genotyped and phenotyped individuals. Assembling such reference populations has been achieved for breeds such as Holstein, but is challenging for breeds with fewer individuals. An alternative is to use a multi-breed reference population, such that smaller breeds gain some advantage in accuracy of genomic estimated breeding values (GEBV) from information from larger breeds. However, this requires that marker-quantitative trait loci associations persist across breeds. Here, we assessed the gain in accuracy of GEBV in Jersey cattle as a result of using a combined Holstein and Jersey reference population, with either 39,745 or 624,213 single nucleotide polymorphism (SNP) markers. The surrogate used for accuracy was the correlation of GEBV with daughter trait deviations in a validation population. Two methods were used to predict breeding values, either a genomic BLUP (GBLUP_mod), or a new method, BayesR, which used a mixture of normal distributions as the prior for SNP effects, including one distribution that set SNP effects to zero. The GBLUP_mod method scaled both the genomic relationship matrix and the additive relationship matrix to a base at the time the breeds diverged, and regressed the genomic relationship matrix to account for sampling errors in estimating relationship coefficients due to a finite number of markers, before combining the 2 matrices. Although these modifications did result in less biased breeding values for Jerseys compared with an unmodified genomic relationship matrix, BayesR gave the highest accuracies of GEBV for the 3 traits investigated (milk yield, fat yield, and protein yield), with an average increase in accuracy compared with GBLUP_mod across the 3 traits of 0.05 for both Jerseys and Holsteins. 
The advantage was limited for either Jerseys or Holsteins in using 624,213 SNP rather than 39,745 SNP (0.01 for Holsteins and 0.03 for Jerseys, averaged across traits). Even this limited and nonsignificant advantage was only observed when BayesR was used. An alternative panel, which extracted the SNP in the transcribed part of the bovine genome from the 624,213 SNP panel (to give 58,532 SNP), performed better, with an increase in accuracy of 0.03 for Jerseys across traits. This panel captures much of the increased genomic content of the 624,213 SNP panel, with the advantage of a greatly reduced number of SNP effects to estimate. Taken together, using this panel, a combined breed reference and using BayesR rather than GBLUP_mod increased the accuracy of GEBV in Jerseys from 0.43 to 0.52, averaged across the 3 traits.

17.
The genomic prediction of unobserved genetic values or future phenotypes for complex traits has revolutionized agriculture and human medicine. Fertility traits are undoubtedly complex traits of great economic importance to the dairy industry. Although genomic prediction for improved cow fertility has received much attention, bull fertility largely has been ignored. The first aim of this study was to investigate the feasibility of genomic prediction of sire conception rate (SCR) in US Holstein dairy cattle. Standard genomic prediction often ignores any available information about functional features of the genome, although it is believed that such information can yield more accurate and more persistent predictions. Hence, the second objective was to incorporate prior biological information into predictive models and evaluate their performance. The analyses included the use of kernel-based models fitting either all single nucleotide polymorphisms (SNP; 55K) or only markers with presumed functional roles, such as SNP linked to Gene Ontology or Medical Subject Heading terms related to male fertility, or SNP significantly associated with SCR. Both single- and multikernel models were evaluated using linear and Gaussian kernels. Predictive ability was evaluated in 5-fold cross-validation. The entire set of SNP exhibited predictive correlations around 0.35. Neither Gene Ontology nor Medical Subject Heading gene sets achieved predictive abilities higher than their counterparts using random sets of SNP. Notably, kernel models fitting significant SNP achieved the best performance with increases in accuracy up to 5% compared with the standard whole-genome approach. Models fitting Gaussian kernels outperformed their counterparts fitting linear kernels irrespective of the set of SNP. Overall, our findings suggest that genomic prediction of bull fertility is feasible in dairy cattle. 
This provides potential for accurate genome-guided decisions, such as early culling of bull calves with low SCR predictions. In addition, exploiting nonlinear effects through the use of Gaussian kernels together with the incorporation of relevant markers seems to be a promising alternative to the standard approach. The inclusion of gene set results into prediction models deserves further research.

18.
The objectives of this study were to describe, using the goat SNP50 BeadChip (Illumina Inc., San Diego, CA), molecular data for the French dairy goat population and compare the effect of using genomic information on breeding value accuracy in different reference populations. Several multi-breed (Alpine and Saanen) reference population sizes, including or excluding female genotypes (from 67 males to 677 males, and 1,985 females), were used. Genomic evaluations were performed using genomic best linear unbiased predictor for milk production traits, somatic cell score, and some udder type traits. At a marker distance of 50 kb, the average r² (squared correlation coefficient) value of linkage disequilibrium was 0.14, and persistence of linkage disequilibrium as correlation of r-values among Saanen and Alpine breeds was 0.56. Genomic evaluation accuracies obtained from cross validation ranged from 36 to 53%. Biases of these estimations assessed by regression coefficients (from 0.73 to 0.98) of phenotypes on genomic breeding values were higher for traits such as protein yield than for udder type traits. Using the reference population that included all males and females, accuracies of genomic breeding values derived from prediction error variances (model accuracy) obtained for young buck candidates without phenotypes ranged from 52 to 56%. This was lower than the average pedigree-derived breeding value accuracies obtained at birth for these males from the official genetic evaluation (62%). Adding females to the reference population of 677 males improved accuracy by 5 to 9% depending on the trait considered. Gains in model accuracies of genomic breeding values ranged from 1 to 7%, lower than reported in other studies. The gains in breeding value accuracy obtained using genomic information were not as good as expected because of the limited size (at most 677 males and 1,985 females) and the structure of the reference population.

19.
In this study, direct genomic values for the functional traits general temperament, milking temperament, aggressiveness, rank order in herd, milking speed, udder depth, position of labia, and days to first heat in Brown Swiss dairy cattle were estimated based on ~777,000 (777K) single nucleotide polymorphism (SNP) information from 1,126 animals. Accuracy of direct genomic values was assessed by a 5-fold cross-validation with 10 replicates. Correlations between deregressed proofs and direct genomic values were 0.63 for general temperament, 0.73 for milking temperament, 0.69 for aggressiveness, 0.65 for rank order in herd, 0.69 for milking speed, 0.71 for udder depth, 0.66 for position of labia, and 0.74 for days to first heat. Using the information of ~54,000 (54K) SNP led to only marginal deviations in the observed accuracy. Trying to predict the 20% youngest bulls led to correlations of 0.55, 0.77, 0.73, 0.55, 0.64, 0.59, 0.67, and 0.77, respectively, for the traits listed above. Using a novel method to estimate the accuracy of a direct genomic value (defined as correlation between direct genomic value and true breeding value and accounting for the correlation between direct genomic values and conventional breeding values) revealed accuracies of 0.37, 0.20, 0.19, 0.27, 0.48, 0.45, 0.36, and 0.12, respectively, for the traits listed above. These values are much smaller but probably also more realistic than accuracies based on correlations, given the heritabilities and sample sizes in this study. Annotation of the largest estimated SNP effects revealed 2 candidate genes affecting the traits general temperament and days to first heat.
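
The validation design mentioned in this abstract, a 5-fold cross-validation with 10 replicates, can be sketched as a generator of training and testing index sets. The function name, the seed, and the random assignment of animals to folds are assumptions for illustration; the abstract does not describe how the folds were formed.

```python
import random


def replicated_kfold(n_animals, n_folds=5, n_reps=10, seed=0):
    """Yield (replicate, fold, train_indices, test_indices) for repeated k-fold CV.

    Within each replicate the animals are reshuffled, so every animal is in the
    testing set exactly once per replicate.
    """
    rng = random.Random(seed)
    for rep in range(n_reps):
        order = list(range(n_animals))
        rng.shuffle(order)
        for fold in range(n_folds):
            # Every n_folds-th animal of the shuffled order forms one testing set.
            test = sorted(order[fold::n_folds])
            test_set = set(test)
            train = [i for i in range(n_animals) if i not in test_set]
            yield rep, fold, train, test


# Hypothetical usage: 1,126 animals as in the abstract, 5 folds x 10 replicates.
splits = list(replicated_kfold(1126))
print(len(splits))
```

In each split, a model would be fitted on the training animals and the correlation between deregressed proofs and direct genomic values computed on the testing animals, then averaged over all splits.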

20.
The aim of this study was to evaluate different-density genotyping panels for genotype imputation and genomic prediction. Genotypes from customized Golden Gate Bovine3K BeadChip [LD3K; low-density (LD) 3,000-marker (3K); Illumina Inc., San Diego, CA] and BovineLD BeadChip [LD6K; 6,000-marker (6K); Illumina Inc.] panels were imputed to the BovineSNP50v2 BeadChip [50K; 50,000-marker; Illumina Inc.]. In addition, LD3K, LD6K, and 50K genotypes were imputed to a BovineHD BeadChip [HD; high-density 800,000-marker (800K) panel], and predictive ability was subsequently evaluated and compared. Comparisons of prediction accuracy were carried out using Random boosting and genomic BLUP. Four traits under selection in the Spanish Holstein population were used: milk yield, fat percentage (FP), somatic cell count, and days open (DO). Training sets at 50K density for imputation and prediction included 1,632 genotypes. Testing sets for imputation from LD to 50K contained 834 genotypes and testing sets for genomic evaluation included 383 bulls. The reference population genotyped at HD included 192 bulls. Imputation using BEAGLE software (http://faculty.washington.edu/browning/beagle/beagle.html) was effective for reconstruction of dense 50K and HD genotypes, even when a small reference population was used, with 98.3% of SNP correctly imputed. Random boosting outperformed genomic BLUP in terms of prediction reliability, mean squared error, and selection effectiveness of top animals in the case of FP. For other traits, however, no clear differences existed between methods. No differences were found between imputed LD and 50K genotypes, whereas evaluation of genotypes imputed to HD was, on average across data sets, methods, and traits, 4% more accurate than 50K prediction and showed a smaller (2%) mean squared error of predictions. 
Similar bias in regression coefficients was found across data sets but regressions were 0.32 units closer to unity for DO when genotypes were imputed to HD density. Imputation to HD genotypes might produce higher stability in the genomic proofs of young candidates. Regarding selection effectiveness of top animals, 2% more top bulls were classified correctly with imputed LD6K genotypes than with LD3K. When the original 50K genotypes were used, correct classification of top bulls increased by 1%, and when those genotypes were imputed to HD, 3% more top bulls were detected. Selection effectiveness could be slightly enhanced for certain traits such as FP, somatic cell count, or DO when genotypes are imputed to HD. Genetic evaluation units may consider a trait-dependent strategy in terms of method and genotype density for use in the genome-enhanced evaluations.
