The effect of phenotyping, adult selection, and mating strategies on genetic gain and rate of inbreeding in black soldier fly breeding programs
The aim of this study was to compare genetic gain and rate of inbreeding across mass selection breeding programs designed to increase larval body weight (LBW) in black soldier flies. The breeding programs differed in: (1) sampling of individuals for phenotyping (either random over the whole population or a fixed number per full sib family), (2) selection of adult flies for breeding (based on an adult individual's phenotype for LBW or random from larvae preselected based on LBW), and (3) mating strategy (mating in a group with unequal male contributions or controlled between two females and one male). In addition, the numbers of phenotyped and preselected larvae were varied. The sex of an individual was unknown during preselection and females had higher LBW, resulting in more females being preselected.
A computationally efficient algorithm to leverage average information REML for (co)variance component estimation in the genomic era
Methods for estimating variance components (VC) using restricted maximum likelihood (REML) typically require elements from the inverse of the coefficient matrix of the mixed model equations (MME). As genomic information becomes more prevalent, the coefficient matrix of the MME becomes denser, presenting a challenge for analyzing large datasets. Thus, computational algorithms based on iterative solving and Monte Carlo approximation of the inverse of the coefficient matrix become appealing. While the standard average information REML (AI-REML) is known for its rapid convergence, its computational intensity imposes limitations. In particular, the standard AI-REML requires solving the MME for each VC, which can be computationally demanding, especially when dealing with complex models with many VC. To bridge this gap, here we (1) present a computationally efficient and tractable algorithm, named the augmented AI-REML, which facilitates the AI-REML by solving an augmented MME only once within each REML iteration; and (2) implement this approach for VC estimation in a general framework of a multi-trait GBLUP model. VC estimation was investigated based on the number of VC in the model, including a two-trait, three-trait, four-trait, and five-trait GBLUP model. We compared the augmented AI-REML with the standard AI-REML in terms of computing time per REML iteration. Direct and iterative solving methods were used to assess the advances of the augmented AI-REML.
Empirical versus estimated accuracy of imputation: optimising filtering thresholds for sequence imputation
Genotype imputation is a cost-effective method for obtaining sequence genotypes for downstream analyses such as genome-wide association studies (GWAS). However, low imputation accuracy can increase the risk of false positives, so it is important to pre-filter data or at least assess the potential limitations due to imputation accuracy. In this study, we benchmarked three different imputation programs (Beagle 5.2, Minimac4 and IMPUTE5) and compared the empirical accuracy of imputation with the software estimated accuracy of imputation (Rsq). We also tested the accuracy of imputation in cattle for autosomal and X chromosomes, SNP and INDEL, when imputing from either low-density or high-density genotypes.
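The comparison between empirical and software-estimated accuracy can be illustrated with a minimal sketch. It assumes that empirical accuracy is measured as the squared Pearson correlation between true genotypes (coded 0/1/2) and imputed dosages at one variant; the abstract does not state the exact metric, and the function names below are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # Monomorphic variant: the correlation is undefined.
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def empirical_r2(true_genotypes, imputed_dosages):
    """Assumed empirical accuracy: squared correlation of true vs imputed."""
    r = pearson_r(true_genotypes, imputed_dosages)
    return r * r


def passes_filter(estimated_rsq, threshold):
    """Keep a variant when the software-estimated Rsq reaches the threshold."""
    return estimated_rsq >= threshold
```

Comparing `empirical_r2` with the reported Rsq across variants is one way to judge how a given Rsq threshold relates to realised accuracy; the threshold itself is a choice the study aims to inform and is not fixed here.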
On the ability of the LR method to detect bias when there is pedigree misspecification and lack of connectedness
Cross-validation techniques in genetic evaluations are limited by the unobservable nature of breeding values and by the difficulty of validating estimated breeding values (EBVs) against pre-corrected phenotypes. The Linear Regression (LR) method is an alternative that addresses these limitations. Furthermore, beef cattle genetic evaluation programs confront challenges with connectedness among herds and pedigree errors. The objective of this work was to evaluate, through simulation, the LR method's performance under the pedigree errors and weak connectedness typical of beef cattle genetic evaluations.
Investigating genotype by environment interaction for beef cattle fertility traits in commercial herds in northern Australia with multi-trait analysis
Genotype by environment interactions (GxE) affect a range of production traits in beef cattle. Quantifying the effect of GxE in commercial and multi-breed herds is challenging due to unknown genetic linkage between animals across environment levels. The primary aim of this study was to use multi-trait models to investigate GxE for three heifer fertility traits, corpus luteum (CL) presence, first pregnancy and second pregnancy, in a large tropical beef multibreed dataset (n = 21,037). Environmental levels were defined by two different descriptors, burden of heat load (temperature humidity index, THI) and nutritional availability (based on mean average daily gain for the herd, ADWG). To separate the effects of genetic linkage and real GxE across the environments, 1000 replicates of a simulated phenotype were generated by simulating QTL effects with no GxE onto real marker genotypes from the population, to determine the genetic correlations that could be expected across environments due to the existing genetic linkage only. Correlations from the real phenotypes were then compared to the empirical distribution under the null hypothesis from the simulated data. By adopting this approach, this study attempted to establish if low genetic correlations between environmental levels were due to GxE or insufficient genetic linkage between animals in each environmental level.
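The comparison of an observed genetic correlation with the simulated null distribution can be sketched as an empirical lower-tail p-value. This is a minimal illustration under assumptions: the one-sided direction (low correlations indicate GxE), the add-one correction, and the function name are choices made here, not details given in the abstract.

```python
def empirical_lower_tail_p(observed, null_values):
    """Share of null replicates with a correlation at or below the observed one.

    The add-one correction counts the observed value as one more draw, so
    the p-value is never exactly zero with a finite number of replicates.
    """
    if not null_values:
        raise ValueError("null_values must not be empty")
    n_at_or_below = sum(1 for v in null_values if v <= observed)
    return (n_at_or_below + 1) / (len(null_values) + 1)
```

With 1000 replicates generated without GxE, a small value would suggest that the observed correlation is lower than genetic linkage alone is expected to produce.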
Population structure and breed identification of Chinese indigenous sheep breeds using whole genome SNPs and InDels
Accurate breed identification is essential for the conservation and sustainable use of indigenous farm animal genetic resources. In this study, we evaluated the phylogenetic relationships and genomic breed compositions of 13 sheep breeds using SNP and InDel data from whole genome sequencing. The breeds included 11 Chinese indigenous and 2 foreign commercial breeds. We compared different strategies for breed identification with respect to different marker types, i.e. SNPs, InDels, and a combination of SNPs and InDels (named SIs), different breed-informative marker detection methods, and different machine learning classification methods.
Genetic parameters and genotype-by-environment interaction estimates for growth and feed efficiency related traits in Chinook salmon, Oncorhynchus tshawytscha, reared under low and moderate flow regimes
A genotype-by-environment (G × E) interaction is defined as genotypes responding differently to different environments. In salmonids, G × E interactions can occur in different rearing conditions, including changes in salinity or temperature. However, water flow, an important variable that can influence metabolism, has yet to be considered for potential G × E interactions, although water flows differ across production stages. The salmonid industry is now manipulating flow in tanks to improve welfare and production performance, and expanding sea pen farming offshore, where flow dynamics are substantially greater. Therefore, there is a need to test whether G × E interactions occur under low and higher flow regimes to determine if industry should consider modifying their performance evaluation and selection criteria to account for different flow environments. Here, we used genotyping-by-sequencing to create a genomic-relationship matrix of 37 Chinook salmon, Oncorhynchus tshawytscha, families to assess possible G × E interactions for production performance under two flow environments: a low flow regime (0.3 body lengths per second; bl s⁻¹) and a moderate flow regime (0.8 bl s⁻¹).
Mitochondrial sequence variants: testing imputation accuracy and their association with dairy cattle milk traits
Mitochondrial genomes differ from the nuclear genome and in humans it is known that mitochondrial variants contribute to genetic disorders. Prior to genomics, some livestock studies assessed the role of the mitochondrial genome but these were limited and inconclusive. Modern genome sequencing provides an opportunity to re-evaluate the potential impact of mitochondrial variation on livestock traits. This study first evaluated the empirical accuracy of mitochondrial sequence imputation and then used real and imputed mitochondrial sequence genotypes to study the role of mitochondrial variants on milk production traits of dairy cattle.
Genotyping both live and dead animals to improve post-weaning survival of pigs in breeding programs
In this study, we tested whether genotyping both live and dead animals (GSD) realises more genetic gain for post-weaning survival (PWS) in pigs compared to genotyping only live animals (GOS).
Identification of genomic regions associated with fatty acid metabolism across blood, liver, backfat and muscle in pigs
The composition and distribution of fatty acids (FA) are important factors determining the quality, flavor, and nutrient value of meat. In addition, FAs synthesized in the body participate in energy metabolism and are involved in different regulatory pathways in the form of signaling molecules or by acting as agonist or antagonist ligands of different nuclear receptors. Finally, synthesis and catabolism of FAs affect adaptive immunity by regulating lymphocyte metabolism. The present study performed genome-wide association studies using FA profiles of blood, liver, backfat and muscle from 432 commercial Duroc pigs.
A million-cow genome-wide association study of productive life in U.S. Holstein cows
Productive life (PL) of a cow is the time the cow remains in the milking herd from first calving to exit from the herd due to culling or death and is an important economic trait in U.S. Holstein cattle. The large samples of Holstein genomic evaluation data that have become available recently provided unprecedented statistical power to identify genetic factors affecting PL in Holstein cows using the approach of genome-wide association study (GWAS).
QTL analysis to identify genes involved in the trade-off between silk protein synthesis and larva-pupa transition in silkworms
Insect-based food and feed are increasingly attracting attention. As a domesticated insect, the silkworm (Bombyx mori) has a highly nutritious pupa that can be easily raised in large quantities through large-scale farming, making it a highly promising source of food. The ratio of pupa to cocoon (RPC) refers to the proportion of the weight of the cocoon that is attributed to pupae, and is of significant value for edible utilization, as a higher RPC means a higher ratio of conversion of mulberry leaves to pupa. In silkworm production, there is a trade-off during metamorphosis between RPC and cocoon shell ratio (CSR), which refers to the ratio of silk protein to the entire cocoon. Understanding the genetic basis of this balance is crucial for breeding edible strains with a high RPC and further advancing their use as feed.
Combined genomic evaluation of Merino and Dohne Merino Australian sheep populations
The Dohne Merino sheep was introduced to Australia from South Africa in the 1990s. It was primarily used in crosses with the Merino breed sheep to improve on attributes such as reproduction and carcass composition. Since then, this breed has continued to expand in Australia but the number of genotyped and phenotyped purebred individuals remains low, calling into question the accuracy of genomic selection. The Australian Merino, on the other hand, has a substantial reference population in a separate genomic evaluation (MERINOSELECT). Combining these resources could fast track the impact of genomic selection on the smaller breed, but the efficacy of this needs to be investigated. This study was based on a dataset of 53,663 genotypes and more than 2 million phenotypes. Its main objectives were (1) to characterize the genetic structure of Merino and Dohne Merino breeds, (2) to investigate the utility of combining their evaluations in terms of quality of predictions, and (3) to compare several methods of genetic grouping. We used the 'LR-method' (Linear Regression) for these assessments.
Segregation GWAS to linearize a non-additive locus with incomplete penetrance: an example of horn status in sheep
The objective of this study was to introduce a genome-wide association study (GWAS) in conjunction with segregation analysis on monogenic categorical traits. Genotype probabilities, calculated from phenotypes, mode of inheritance and pedigree information, are expressed as the expected allele count (EAC) (range 0 to 2). By definition, the EAC are inherited additively, unlike the original phenotypes, which are non-additive and could be of incomplete penetrance. The EAC are regressed on the single nucleotide polymorphism (SNP) genotypes, similar to an additive GWAS. In this study, horn status in Merino sheep, a trait believed to be monogenic, affected by dominance and sex-dependent expression, and likely affected by incomplete penetrance, is used to illustrate the advantages of the segregation GWAS. We also used simulation to investigate whether incomplete penetrance can cause prediction errors in Merino sheep for horn status.
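The two steps described above, converting genotype probabilities to an EAC and regressing the EAC on SNP genotypes, can be sketched as follows. This is an illustration under assumptions: the genotype probabilities are taken as given (the segregation analysis that produces them is not shown), a plain single-SNP least-squares fit stands in for whatever model the study fitted, and all names are hypothetical.

```python
def expected_allele_count(p_heterozygous, p_homozygous):
    """EAC = 1 * P(one copy) + 2 * P(two copies); lies between 0 and 2."""
    if p_heterozygous < 0 or p_homozygous < 0:
        raise ValueError("probabilities must be non-negative")
    if p_heterozygous + p_homozygous > 1 + 1e-12:
        raise ValueError("genotype probabilities must sum to at most 1")
    return p_heterozygous + 2 * p_homozygous


def regress_on_snp(eac_values, snp_genotypes):
    """Least-squares fit of EAC on SNP allele counts (0/1/2).

    Returns (intercept, slope); the slope plays the role of the additive
    SNP effect in an ordinary single-marker GWAS.
    """
    n = len(eac_values)
    if n != len(snp_genotypes) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(snp_genotypes) / n
    mean_y = sum(eac_values) / n
    sxx = sum((x - mean_x) ** 2 for x in snp_genotypes)
    if sxx == 0:
        raise ValueError("SNP is monomorphic in this sample")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(snp_genotypes, eac_values))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope
```

A SNP in strong linkage disequilibrium with the causal locus would be expected to give a slope near 1, because both variables are then counts of nearly the same allele.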
A comprehensive atlas of nuclear sequences of mitochondrial origin (NUMT) inserted into the pig genome
The integration of nuclear mitochondrial DNA (mtDNA) into the mammalian genomes is an ongoing, yet rare evolutionary process that produces nuclear sequences of mitochondrial origin (NUMT). In this study, we identified and analysed NUMT inserted into the pig (Sus scrofa) genome and in the genomes of a few other Suinae species. First, we constructed a comparative distribution map of NUMT in the Sscrofa11.1 reference genome and in 22 other assembled S. scrofa genomes (from Asian and European pig breeds and populations), as well as the assembled genomes of the Visayan warty pig (Sus cebifrons) and warthog (Phacochoerus africanus). We then analysed a total of 485 whole genome sequencing datasets, from different breeds, populations, or Sus species, to discover polymorphic NUMT (inserted/deleted in the pig genome). The insertion age was inferred based on the presence or absence of orthologous NUMT in the genomes of different species, taking into account their evolutionary divergence. Additionally, the age of the NUMT was calculated based on sequence degradation compared to the authentic mtDNA sequence. We also validated a selected set of representative NUMT via PCR amplification.
Analysis of the genetic variance of fibre diameter measured along the wool staple for use as a potential indicator of resilience in sheep
The effects of environmental disturbances on livestock are often observed indirectly through the variability patterns of repeated performance records over time. Sheep are frequently exposed to diverse extensive environments but currently lack appropriate measures of resilience (or sensitivity) towards environmental disturbance. In this study, random regression models were used to analyse repeated records of the fibre diameter of wool taken along the wool staple (bundle of wool fibres) to investigate how the genetic and environmental variance of fibre diameter changes with different growing environments.
A computationally feasible multi-trait single-step genomic prediction model with trait-specific marker weights
Regions of genome-wide marker data may have differing influences on the evaluated traits. This can be reflected in the genomic models by assigning different weights to the markers, which can enhance the accuracy of genomic prediction. However, the standard multi-trait single-step genomic evaluation model can be computationally infeasible when the traits are allowed to have different marker weights.
Marker effect p-values for single-step GWAS with the algorithm for proven and young in large genotyped populations
Single-nucleotide polymorphism (SNP) effects can be backsolved from ssGBLUP genomic estimated breeding values (GEBV) and used for genome-wide association studies (ssGWAS). However, obtaining p-values for those SNP effects relies on the inversion of dense matrices, which poses computational limitations in large genotyped populations. In this study, we present a method to approximate SNP p-values for ssGWAS with many genotyped animals. This method relies on the combination of a sparse approximation of the inverse of the genomic relationship matrix built with the algorithm for proven and young (APY) and an approximation of the prediction error variance of SNP effects which does not require the inversion of the left-hand side (LHS) of the mixed model equations. To test the proposed p-value computing method, we used a reduced genotyped population of 50K genotyped animals and compared the approximated SNP p-values with benchmark p-values obtained with the direct inverse of the LHS built with an exact genomic relationship matrix. Then, we applied the proposed approximation method to obtain SNP p-values for a larger genotyped population composed of 450K genotyped animals.
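The final step, turning a SNP effect and its variance into a p-value, can be sketched as a two-sided normal test. This is a minimal illustration assuming that a variance for the estimated SNP effect is already available; how that variance is approximated is the subject of the study and is not reproduced here, and the function and parameter names are hypothetical.

```python
import math


def snp_p_value(effect, effect_variance):
    """Two-sided p-value for a SNP effect under a standard normal reference.

    z = effect / sqrt(variance), and p = 2 * (1 - Phi(|z|)), which equals
    erfc(|z| / sqrt(2)).
    """
    if effect_variance <= 0:
        raise ValueError("effect_variance must be positive")
    z = effect / math.sqrt(effect_variance)
    return math.erfc(abs(z) / math.sqrt(2))
```

Because the test statistic depends only on the effect and its variance, any error in the approximated variance carries directly into the p-value, which is why the abstract benchmarks against values from the direct inverse.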
Meta-analysis of six dairy cattle breeds reveals biologically relevant candidate genes for mastitis resistance
Mastitis is a disease that incurs significant costs in the dairy industry. A promising approach to mitigate its negative effects is to genetically improve the resistance of dairy cattle to mastitis. A meta-analysis of genome-wide association studies (GWAS) across multiple breeds for clinical mastitis (CM) and its indicator trait, somatic cell score (SCS), is a powerful method to identify functional genetic variants that impact mastitis resistance.
Investigating the footprint of post-domestication dispersal on the diversity of modern European, African and Asian goats
Goats were domesticated in the Fertile Crescent about 10,000 years before present (YBP) and subsequently spread across Eurasia and Africa. This dispersal is expected to generate a gradient of declining genetic diversity with increasing distance from the areas of early livestock management. Previous studies have reported the existence of such a genetic cline in European goat populations, but they were based on a limited number of microsatellite markers. Here, we have analyzed data generated by the AdaptMap project and other studies. More specifically, we have used the geographic coordinates and estimates of the observed (Ho) and expected (He) heterozygosities of 1077 European, 1187 African and 617 Asian goats belonging to 38, 43 and 22 different breeds, respectively, to find out whether genetic diversity and distance to Ganj Dareh, a Neolithic settlement in western Iran for which evidence of an early management of domestic goats has been obtained, are significantly correlated.
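The two diversity measures named above can be sketched for a single biallelic SNP. This assumes genotypes coded as allele counts (0, 1, 2) with no missing values, and uses the Hardy-Weinberg expectation 2p(1 - p) without a small-sample correction; the estimators actually used in the study are not stated in the abstract, and the function names are hypothetical.

```python
def observed_heterozygosity(genotypes):
    """Proportion of individuals that are heterozygous (coded 1)."""
    if not genotypes:
        raise ValueError("genotypes must not be empty")
    return sum(1 for g in genotypes if g == 1) / len(genotypes)


def expected_heterozygosity(genotypes):
    """Hardy-Weinberg expectation 2p(1 - p) from the sample allele frequency."""
    if not genotypes:
        raise ValueError("genotypes must not be empty")
    p = sum(genotypes) / (2 * len(genotypes))
    return 2 * p * (1 - p)
```

Averaging either measure over SNPs gives one value per breed, which could then be correlated with each breed's geographic distance from the reference site.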
Genetic diversity of United States Rambouillet, Katahdin and Dorper sheep
Managing genetic diversity is critically important for maintaining species fitness. Excessive homozygosity caused by the loss of genetic diversity can have detrimental effects on the reproduction and production performance of a breed. Analysis of genetic diversity can facilitate the identification of signatures of selection which may contribute to the specific characteristics regarding the health, production and physical appearance of a breed or population. In this study, breeds with well-characterized traits such as fine wool production (Rambouillet, N = 745), parasite resistance (Katahdin, N = 581) and environmental hardiness (Dorper, N = 265) were evaluated for inbreeding, effective population size (Ne) and runs of homozygosity (ROH), and a Wright's fixation index (FST) outlier approach was used to identify differential signatures of selection at 36,113 autosomal single nucleotide polymorphisms (SNPs).
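The idea behind runs of homozygosity can be illustrated with a deliberately simplified sketch. It assumes genotypes along one chromosome coded 0/1/2 in map order, treats any heterozygote as ending a run, and measures run length in SNPs rather than base pairs; established tools use sliding windows and tolerate some heterozygous or missing calls, so this is not the procedure used in the study. The function names are hypothetical.

```python
def runs_of_homozygosity(genotypes, min_snps):
    """Return (start, end) index pairs of homozygous stretches >= min_snps."""
    runs = []
    start = None
    for i, g in enumerate(genotypes):
        if g in (0, 2):
            if start is None:
                start = i          # a new homozygous stretch begins
        else:
            if start is not None and i - start >= min_snps:
                runs.append((start, i - 1))
            start = None           # a heterozygote breaks the stretch
    if start is not None and len(genotypes) - start >= min_snps:
        runs.append((start, len(genotypes) - 1))
    return runs


def fraction_in_roh(genotypes, min_snps):
    """Share of SNPs inside a run; a SNP-count proxy for ROH-based inbreeding."""
    if not genotypes:
        raise ValueError("genotypes must not be empty")
    covered = sum(end - start + 1
                  for start, end in runs_of_homozygosity(genotypes, min_snps))
    return covered / len(genotypes)
```

The minimum run length is the key tuning choice: a short threshold picks up stretches that arise by chance, while a long one captures only recent inbreeding.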