Tracking male DNA transfer and survival under female victim fingernails: Insights from a 24-h scratch simulation
In forensic investigations, the collection of biological material under fingernails may provide important evidence in cases of physical or sexual assault. Among these, scenarios suggesting alternative activities for the presence/absence of the DNA rather than questioning the donor of the trace are particularly challenging. To provide data supporting the interpretation of these cases, we investigated the transfer, persistence, and presence of background male DNA under female fingernails in controlled experiments of simulated scratching. Unlike previous studies, subungual samples were collected over short and long periods, up to 24 h after the scratching without preliminary cleaning of the nails. Y-STRs data showed that the DNA of the male individual scratched by a woman was detected in fingernail samples collected immediately and up to 6 h post-scratching. A notable decrease in male DNA quantity was observed after the first 3 h of scratching. Interestingly, the same foreign Y-STR profiles, different from the participating individuals, were observed between 6 and 24 h post-simulation. Overall, our data confirm that the detection of the offender's DNA from subungual samples is very likely immediately after the assault; yet, persistent background or newly transferred DNA may challenge the interpretation of traces collected after 6 h. Finally, one scenario was discussed to illustrate the value of these data for evaluating fingernail evidence when considering activity-level propositions.
Propositions used to assess the value of forensic DNA mixtures in an FSWG-ISFG interlaboratory comparison
DNA interpretation relies on the evaluation of results under at least two mutually exclusive propositions. This study evaluates the application of the International Society for Forensic Genetics (ISFG) recommendations concerning the formulation of such propositions across 15 laboratories in six countries and examines how evaluations are conducted when two persons of interest are considered. The laboratories were asked to assess results considering propositions about the source of the DNA during an interlaboratory comparison organized by the French Speaking Working Group of the ISFG. This article focuses on a DNA mixture from a mock case involving a complainant and two persons of interest. Seven of the eight ISFG recommendations were applicable to this interlaboratory comparison, with six being implemented by all laboratories. However, when two persons of interest were submitted for comparison without further case information, only seven laboratories followed Recommendation 3 by assigning a different likelihood ratio (LR) to each potential contributor. One of them used multiple propositions (more than two mutually exclusive propositions) and considered that each person, in turn, could or could not be the source of the DNA with or without the other person. The eight remaining laboratories assigned only one LR considering that both persons were contributors, or neither. As stated in the ISFG recommendations, such a practice should be avoided as it could lead to an overestimation of the LR for one of the contributors. We also demonstrate the effect on the LR of considering, or not, the presence of the DNA of persons whose contribution to the DNA mixture was not disputed (i.e., "conditioning" the DNA results on the DNA profiles of the undisputed source). When conditioning is applied in ground truth experiments, the results provide stronger support for the proposition known to be true compared to the alternative.
More precisely, the LR increased by a factor of 100 to 10,000 when conditioning, depending on the laboratory. The LRs assigned in two real cases are presented to illustrate the need to consider new information, such as the presence of a potential contributor to a DNA mixture, when evaluating results, and to use multiple propositions when several persons of interest are considered. Either can significantly change the LR value.
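The difference between a single joint LR and one LR per person of interest can be illustrated with a minimal sketch. It is not the procedure used by any participating laboratory: the four likelihood values are hypothetical, and the equal weighting of the sub-propositions on each side of the ratio is an assumption made only for illustration.

```python
# Hypothetical probabilities of the DNA results under four mutually
# exclusive propositions about two persons of interest (POI 1, POI 2).
# The numbers are invented for illustration only.
likelihoods = {
    "both": 1e-10,     # POI 1 and POI 2 are contributors
    "only_1": 1e-12,   # POI 1 and an unknown person are contributors
    "only_2": 1e-16,   # POI 2 and an unknown person are contributors
    "neither": 1e-20,  # two unknown persons are contributors
}


def lr_joint(lik):
    """Single LR: both persons are contributors versus neither."""
    return lik["both"] / lik["neither"]


def lr_per_person(lik):
    """One LR per person, using multiple propositions.

    Each person is a contributor (with or without the other person)
    versus not a contributor (with or without the other person).
    ASSUMPTION: the two sub-propositions on each side get equal weight,
    so the weights cancel and plain sums can be used.
    """
    lr1 = (lik["both"] + lik["only_1"]) / (lik["only_2"] + lik["neither"])
    lr2 = (lik["both"] + lik["only_2"]) / (lik["only_1"] + lik["neither"])
    return lr1, lr2


lr1, lr2 = lr_per_person(likelihoods)
print(f"joint LR: {lr_joint(likelihoods):.3g}")
print(f"LR for POI 1: {lr1:.3g}")
print(f"LR for POI 2: {lr2:.3g}")
```

With these invented numbers the joint LR is far larger than the LR for the second person, which is the kind of overestimation the recommendations warn about.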
A CE-based mRNA profiling method including six targets to estimate the time since deposition of blood stains
An association between RNA degradation and the time since deposition (TsD) of a biological stain has previously been demonstrated. Despite the encouraging results obtained with several RNA markers, the variability in results between individuals and analytical approaches limits the method's application in casework. The incorporation of multiple markers into a single prediction model could enhance estimation accuracy. Typically, real-time qPCR has been the primary analytical platform for these studies. However, qPCR requires high sample volumes and involves numerous pipetting steps when analysing multiple markers, increasing the risk of errors. In this study, we aim to optimize the TsD analysis by combining six targets in three mRNA markers (S100A12, LGALS2 and CLC) in a PCR multiplex and transitioning the analysis platform from qPCR to capillary electrophoresis (CE). This collaborative effort between the Department of Forensic Research at Oslo University Hospital (Laboratory 1) and the Zurich Institute of Forensic Medicine (ZIFM, Laboratory 2) analysed a total of six sample sets, spanning a period of 0 days up to 1.5 years (551 days), along with a broad set of test samples including different carrier materials. Furthermore, a machine learning model was employed to predict the age of bloodstains, aiming to enhance the precision and reliability of TsD estimations.
Assessing transcriptomic signatures of aging: Testing an mRNA marker panel for forensic age estimation of blood samples
Estimating the age of an unknown perpetrator can be a valuable tool in narrowing down a group of suspects. Research efforts to estimate the age of a stain donor have mainly focused on epigenetic modifications, but there is evidence that RNA expression patterns, i.e. the composition of the transcriptome, change with increasing age, which could be a promising molecular alternative for age prediction. In a previous study, we identified a total of 508 mRNA markers with age related expression from two blood whole transcriptome sequencing data sets, using differential expression analysis with DESeq2 and marker selection with lasso regression. For this study, the selected markers from both approaches were combined into an RNA-specific targeted MPS assay for the Ion Torrent platform and evaluated with 100 EDTA blood samples from healthy donors (aged between 23 and 73 years). We compared three different normalization methods for the obtained sequencing data and investigated the performance of various regression techniques for age prediction. The model based on elastic net regression and dSVA-normalized data exhibited the most robust performance, achieving an MAE of 9.29 years and a correlation of 0.57 between the chronological and predicted age. Although the use of a targeted approach instead of RNA-Seq offers several advantages in a forensic setting, we observed a considerable amount of unwanted variation in the targeted sequencing data. We conclude that it is challenging to detect distinct signals associated with chronological age.
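The two performance figures quoted above (MAE and the correlation between chronological and predicted age) can be computed as in the following minimal sketch. The ages are made up for illustration, and the use of Pearson's coefficient is an assumption, since the abstract does not name the correlation measure.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted ages."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical chronological and predicted ages (years).
chronological = [30, 40, 50, 60]
predicted = [35, 38, 55, 52]

print(mean_absolute_error(chronological, predicted))
print(pearson_r(chronological, predicted))
```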
A biogeographical ancestry inference pipeline using PCA-XGBoost model and its application in Asian populations
Biogeographical ancestry (BGA) inference plays a crucial role in genetics, anthropology, forensic science, and medical research. Current methods like principal component analysis (PCA) and ADMIXTURE, based on single nucleotide polymorphisms, are commonly used. Here, we introduce a bio-geographical ancestry inference pipeline that integrates prior population structure and clustering. Our pipeline first analyzes genetic structure on cleaned data to obtain optimal parameters and classification model labels. An XGBoost (eXtreme Gradient Boosting) classification model is constructed using principal components from PCA, and model predictions are evaluated with LR (likelihood ratio). The pipeline was applied to a dataset of Asian populations, with a first prediction accuracy of 96.27 % achieved. The LR-based evaluation accuracy reached 98.96 %, showing an improvement of 2.69 % with the introduction of LR assessment. This highlights the robust predictive capability of our pipeline and the improved accuracy in evaluation with LR. This successful application will benefit genetic research, human history studies, and criminal investigations. Additionally, the pipeline's versatility allows application to new datasets.
Differences of tsRNA expression profiles efficiently discriminate monozygotic twins in peripheral blood
Monozygotic twins (MZTs) share nearly identical genomic DNA sequences, making traditional forensic short tandem repeat (STR) genotyping methods ineffective for distinguishing between them. In recent years, the use of epigenetic factors in forensic applications has gained traction. Dynamic epigenetic factors can be influenced by inherited traits or acquired environmental factors. This study analyzed the expression profiles of transfer RNA-derived small RNAs (tsRNAs) in peripheral blood from four pairs of adult MZTs using Panoramic RNA Display by Overcoming RNA Modification Aborted Sequencing (PANDORA-seq). Differentially expressed tsRNAs (DEtsRNAs) were identified and validated using reverse-transcriptase quantitative polymerase chain reaction (RT-qPCR) and droplet digital PCR (ddPCR) in both adult and newborn MZTs. The study also evaluated the longitudinal temporal stability, resistance to degradation, and suitability of DEtsRNAs for aged bloodstains. A total of 8795 expressed tsRNAs were identified in the four pairs of adult MZTs by PANDORA-seq. After screening with a normalized |log (fold change)| > 1 and an adjusted p-value < 0.05, we found that 10, 187, and 1520 DEtsRNAs were shared by 4, 3, and 2 pairs of MZTs, respectively. RT-qPCR and ddPCR confirmed the expression of the 10 DEtsRNAs identified by PANDORA-seq. Six candidate tsRNAs (tRNA-Gly-GCC, tRNA-Leu-TAA, tRNA-Lys-CTT, tRNA-Val-AAC_5_end, tRNA-iMet-CAT_5_end, and tsRNA-3023a/b-PheGAA) were identified as effective discrimination markers, even in neonatal MZTs, which are largely unaffected by environmental factors. Forensic applicability assessment revealed that tRNA-Gly-GCC and tRNA-Leu-TAA remained detectable in the 180-day-series bloodstains, while tRNA-Lys-CTT, tRNA-Val-AAC_5_end, and tRNA-iMet-CAT_5_end were relatively stable after 15 freeze-thaw cycles. Additionally, tRNA-Gly-GCC and tRNA-Lys-CTT exhibited long-term stability, with consistent expression over six months.
In conclusion, this study demonstrates that differential tsRNA expression can serve as a novel biomarker for MZT identification in forensic medicine.
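The screening step (absolute log fold change above 1 and adjusted p-value below 0.05, followed by counting how many twin pairs share each marker) can be sketched as follows. The marker names and values are hypothetical, and counting markers that pass in exactly a given number of pairs is an assumption about how "shared by" was tallied.

```python
from collections import Counter


def is_differential(log_fc, adj_p, fc_cutoff=1.0, p_cutoff=0.05):
    """Apply the two screening thresholds quoted in the abstract."""
    return abs(log_fc) > fc_cutoff and adj_p < p_cutoff


def pairs_per_marker(results_by_pair):
    """Count, for each marker, the number of twin pairs in which it passes.

    results_by_pair: list with one dict per twin pair, mapping a marker
    name to a (log fold change, adjusted p-value) tuple.
    """
    hits = Counter()
    for pair_results in results_by_pair:
        for marker, (log_fc, adj_p) in pair_results.items():
            if is_differential(log_fc, adj_p):
                hits[marker] += 1
    return hits


# Hypothetical results for three twin pairs and two invented markers.
example = [
    {"marker_a": (2.0, 0.01), "marker_b": (0.5, 0.01)},
    {"marker_a": (-1.5, 0.02), "marker_b": (1.2, 0.04)},
    {"marker_a": (1.1, 0.20)},
]

hits = pairs_per_marker(example)
# Number of markers shared by exactly k pairs, keyed by k.
shared = Counter(hits.values())
print(dict(hits), dict(shared))
```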
Inter-platform evaluation of the MPSplex large-scale tri-allelic SNP panel for forensic identification
MPSplex is a large-scale forensic massively parallel sequencing (MPS) panel with 1,270 tri-allelic SNPs, 44 microhaplotypes (MH) and 55 ancestry-informative bi-allelic SNPs (aiSNPs) designed for missing persons identification. We have evaluated MPSplex with the most widely used MPS platforms in the forensic field: the Illumina MiSeq, the Thermo Fisher Scientific Ion S5 and the Qiagen GeneReader. The tri-allelic SNPs of MPSplex were previously identified from the most polymorphic loci with three common alleles in 1000 Genomes Phase III data and combined with the 44 MH and 55 aiSNPs, then implemented into a QIAseq Targeted DNA Custom Panel (Qiagen), a marker panel which uses Unique Molecular Indices or UMIs. The UMI random-sequence DNA molecules are incorporated onto DNA fragments before the Target Enrichment PCR, allowing the identification of reads that originated from the same template and consequently they can be used to correct the errors that may arise within the PCR or the sequencing process. In this study, we present the results of an inter-platform evaluation of the MPSplex panel, characterizing its performance in different forensic scenarios, which assessed aspects that include sensitivity, genotyping accuracy and mixture analysis. MPSplex aims to provide a tool designed for kinship analysis that can be applied beyond the resolution of first- or second-degree relationships, avoiding the need for much bigger forensic panels designed for genealogy purposes, which usually require significantly more sequencing resources. This study provides evaluation of MPSplex using the MPS systems in routine use for forensic genotyping of large-scale panels of SNPs.
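The idea behind UMI-based error correction, grouping reads that share a UMI and keeping the majority call, can be sketched as below. This is a simplified illustration, not the algorithm implemented in the QIAseq software; the UMI sequences and base calls are invented.

```python
from collections import Counter, defaultdict


def umi_consensus(observations):
    """Collapse reads to one base call per UMI by majority vote.

    observations: iterable of (umi, base) tuples for a single SNP position.
    Returns a dict mapping each UMI to its most frequent base.
    """
    by_umi = defaultdict(Counter)
    for umi, base in observations:
        by_umi[umi][base] += 1
    return {umi: counts.most_common(1)[0][0] for umi, counts in by_umi.items()}


# Hypothetical reads: the single "T" under UMI AAC is treated as a
# PCR or sequencing error and is outvoted by the two "G" reads.
reads = [
    ("AAC", "G"), ("AAC", "G"), ("AAC", "T"),
    ("GTT", "A"), ("GTT", "A"),
]

consensus = umi_consensus(reads)
# Count original template molecules per allele rather than raw reads.
molecules_per_allele = Counter(consensus.values())
print(consensus, dict(molecules_per_allele))
```

Counting consensus calls per UMI instead of raw reads means each original template molecule contributes once to the genotype.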
DNA methylation-based semen age prediction using the markers identified in Koreans and Europeans
In the forensic field, sexual assaults have consistently been an important issue, with semen frequently serving as the primary evidence. When the suspect is unidentified, estimating the perpetrator's age from semen can provide important information. The VISAGE consortium conducted research on semen age prediction focused on European semen samples, but the age prediction model has remained undisclosed. Additionally, several studies have reported methylation differences across populations, indicating that the European semen age prediction model might not be broadly applicable to other groups. One study did explore semen age prediction in Koreans using Illumina's Infinium Methylation450K BeadChip array; however, recent developments in technology could enhance this approach. To address this, we conducted a study on Korean males aged 18-70 years. We initially analyzed 49 samples utilizing Illumina's Infinium MethylationEPIC BeadChip array to identify age-related CpG sites. From this analysis, we identified 9 age-related CpG markers, excluding one due to difficulties in locus-specific analysis. As a result, we used 11 markers including 8 newly identified CpGs from the EPIC array and 3 CpG markers from previous research utilizing the SNaPshot assay. Furthermore, we incorporated 13 CpG markers from the European study to analyze a total of 159 semen samples using the Illumina Nextera MPS system. This approach enabled us to test age-related markers identified in Europeans within the Korean population and to construct a more accurate age prediction model using markers from both Korean and European sources.
Genetic predictions of eye and hair colour in the Danish population
Genetic predictions of eye and hair colour are prominent examples of forensic DNA phenotyping that can help resolve criminal cases. The advent of high-throughput genotyping technologies in forensic genetics opens up the possibility of applying polygenic risk scores in forensic settings. In this work, we compare the performance of HIrisPlex with PRSice-2 in predicting eye and hair colour to gain insights into the relative benefits of new approaches. Predictions were carried out on 584 Danish high school students for which genetic and self-reported phenotype data were available. Prediction of brown eye colour was very accurate (AUC = 0.98), followed by blue eye colour (AUC = 0.82), while it failed for intermediate eye colour (AUC = 0.57). As for hair colour, red and black were overall better predicted than blond and brown, and PRSice-2 performed better in all but the black hair colour. Despite the limitations of the study, HIrisPlex exhibited its usual high performance in the prediction of brown and blue eye colour, as well as red and black hair colour. However, PRSice-2 offered overall improvements in hair colour prediction over HIrisPlex suggesting that there is room for improvement in forensic DNA phenotyping by using polygenic risk scores.
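The AUC values reported above can be read as the probability that a randomly chosen individual with the phenotype receives a higher prediction score than a randomly chosen individual without it. A minimal sketch of that pairwise definition follows; the scores are hypothetical and this is not the evaluation code used in the study.

```python
def auc(scores_positive, scores_negative):
    """Area under the ROC curve from the pairwise (Mann-Whitney) definition.

    Each positive/negative pair counts 1 if the positive scores higher,
    0.5 for a tie and 0 otherwise.
    """
    wins = 0.0
    for pos in scores_positive:
        for neg in scores_negative:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))


# Hypothetical predicted probabilities of "brown eyes".
brown_eyed = [0.9, 0.8, 0.4]
not_brown_eyed = [0.7, 0.3, 0.2]

print(auc(brown_eyed, not_brown_eyed))
```

An AUC of 0.5 corresponds to scores that carry no information, which is why a value of 0.57 for intermediate eye colour is described as a failed prediction.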
Species level and SNP profiling of skin microbiome improve the specificity in identifying forensic fluid and individual
Human skin possesses individual and body fluid-specific microbial signatures potentially useful for forensic identification. Previous studies mostly attribute samples to individuals based on the relative abundance of microbiota at a single time point; however, fluctuations in taxonomy and phylogenetic structure may make this unreliable. In this study, we assessed the skin microbiome of individuals at consecutive time points from finger, palm, arm and forehead sites using full-length 16S rRNA gene sequencing. At the species level, hand samples (fingers, palm, arm) differed significantly from forehead microbes. Additionally, the skin flora of the present study differed significantly from the dominant species that have been reported for saliva, feces, and vaginal secretion samples. ANOSIM analysis of all skin samples showed that inter-individual differences were greater than intra-individual differences, yet the accuracy of individual identification was only 52.5 %. At the microbial gene level, three machine learning models based on single nucleotide polymorphism (SNP) profiles of Cutibacterium acnes resulted in accurate classification of more than 97.5 % of individuals. These results indicate that bacterial SNP profiling may provide new directions for forensic identification and may have potential applications in body fluid identification and individual identification in forensic science.
Forensic investigative genetic genealogy using genotypes generated or imputed from transcriptomes
The utility of transcriptome analysis in forensic genetics is steadily increasing. The transcriptome, with its ability to reflect both transcript levels and their nucleotide sequences, is proving to be useful for a variety of different applications, including body fluid identification and donor assignment, thereby providing both genetic and contextual information. Furthermore, the substantial single nucleotide polymorphism (SNP) coverage obtainable with whole transcriptome sequencing may prove useful for additional applications. In this study, we expand the current knowledge of transcriptomics in forensic genetics by showing how RNA can be used for forensic investigative genetic genealogy (FIGG) purposes and inference of distant relationships. Genetic data was simulated for relationships ranging from full siblings (first-degree relatives) to third cousins (seventh-degree relatives). The sets of SNP genotypes were subsequently reduced to only include observed and imputed SNP genotypes at loci covered by transcriptome sequencing of whole blood. The relationships of relatives as distant as second cousins could be reliably classified based on an average of 99,548 SNPs. Appropriate thresholds for sequence quality parameters limited the rate of erroneous genotype calls, with the remaining errors proving to have little to no effect on relationship inference. In conclusion, we present a proof-of-concept study on how transcriptome-based genotypes, in combination with imputed genotypes, may be used for reliable relationship inference for FIGG purposes.
Nanopore sequencing of MiniHap biomarkers for forensic DNA mixture deconvolution: A proof-of-principle study
Mixture deconvolution remains one of the major challenges in the field of forensic science. Genetic markers currently used and studied in forensic genetics include short tandem repeat (STR), insertion/deletion polymorphism (InDel), single nucleotide polymorphism (SNP), InDel closely linked to STR (DIP-STR), SNP closely linked to STR (SNP-STR), InDel closely linked to SNP (DIP-SNP) and microhaplotype (MH); all of these have been studied for DNA mixture analysis, and each has its own advantages and disadvantages. Mini-haplotype (MiniHap), a novel haplotype genetic marker, contains 5 or more SNPs. A previous study has substantiated its high polymorphism, and it is expected to have potential applications in individual identification, paternity testing, ancestry inference, and mixture deconvolution. In this study, we first screened 22 MiniHaps with high polymorphism potential and constructed a panel based on the QNome nanopore sequencing device. Subsequently, we tested 100 unrelated Chinese Han individuals to evaluate the sequencing performance, allele (haplotype) frequencies, effective number of alleles (A) and forensic parameters of the 22 MiniHap markers included in this novel assay. Next, a series of simulated mixtures (two- or three-person mixtures with mixing ratios of 1:1-1:99 or 1:1:1-1:8:1) based on three standard materials (9947A, 9948 and 2800M) were analyzed with this MiniHap panel to explore its potential for DNA mixture deconvolution. The average A value was 10.9574, and 52.38 % of MiniHap loci had A values greater than 12.0000. The mean values of genetic diversity (GD) and power of discrimination (PD) were 0.8717 and 0.9457, respectively. Notably, most MiniHaps (85.71 %) had PD values exceeding 0.9000. The combined match probability (CMP) and combined power of exclusion (CPE) of this MiniHap panel were 4.4505 × 10 and 0.999999999999999996653, respectively.
Moreover, the results of mixture analysis demonstrated that this MiniHap panel allowed detection of the minor contributor(s) even in imbalanced mixture samples, with detection limits of 1:39 and 1:8:1 for two- and three-person mixtures, respectively. In summary, MiniHap markers have remarkable application potential in mixture deconvolution, and further in-depth research on MiniHap markers for mixture analysis is needed.
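For readers less familiar with the forensic parameters reported above, the following minimal sketch shows how the effective number of alleles, the per-locus match probability, the power of discrimination and the combined match probability relate to haplotype frequencies. It assumes Hardy-Weinberg genotype proportions and independent loci, which are assumptions of this illustration rather than statements about how the study computed its values; the frequencies are hypothetical.

```python
from itertools import combinations
from math import prod


def effective_number_of_alleles(freqs):
    """Reciprocal of the sum of squared allele (haplotype) frequencies."""
    return 1.0 / sum(p * p for p in freqs)


def match_probability(freqs):
    """Probability that two random individuals share a genotype.

    ASSUMPTION: genotype frequencies follow Hardy-Weinberg proportions,
    p^2 for homozygotes and 2pq for heterozygotes.
    """
    homozygous = sum((p * p) ** 2 for p in freqs)
    heterozygous = sum((2 * p * q) ** 2 for p, q in combinations(freqs, 2))
    return homozygous + heterozygous


def power_of_discrimination(freqs):
    """Complement of the per-locus match probability."""
    return 1.0 - match_probability(freqs)


def combined_match_probability(loci):
    """Product of per-locus match probabilities (independent loci assumed)."""
    return prod(match_probability(freqs) for freqs in loci)


# Hypothetical loci: two equally frequent alleles, and four equally
# frequent alleles.
locus_a = [0.5, 0.5]
locus_b = [0.25, 0.25, 0.25, 0.25]

print(effective_number_of_alleles(locus_a), power_of_discrimination(locus_a))
print(effective_number_of_alleles(locus_b), power_of_discrimination(locus_b))
print(combined_match_probability([locus_a, locus_b]))
```

More alleles at more even frequencies raise both the effective number of alleles and the power of discrimination, which is why highly polymorphic haplotype markers are attractive for mixture work.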
Kinship cases with partially specified hypotheses
Forensic kinship testing is the statistical comparison of a set of hypothesised relationships, based on genetic marker data from the individuals in question and possibly other relatives. In most circumstances each hypothesis is completely specified in terms of a pedigree, but this is not always the case in more complex scenarios. For example, suppose that we are asked to test H1: A is the grandmother of B, against H2: A and B are unrelated, and that the data also includes a third individual whose relationship with the others is uncertain. There may then be multiple pedigrees consistent with each hypothesis, with the consequence that the standard likelihood ratio (LR) cannot be calculated unless prior probabilities are specified for all alternatives. In response to these challenges we introduce a generalised likelihood ratio (GLR), defined as the ratio of the maximal likelihood of the data given H1 to the maximal likelihood given H2. This resembles a version of the LR test used in classical hypothesis testing, but differs in several aspects. Most importantly, in the forensic setting we usually consider discrete alternatives rather than continuous parameter spaces. The properties of the GLR statistic are explored through real-life examples of kinship testing and disaster victim identification (DVI). In particular, we demonstrate how the GLR may help to resolve and report the results in complex DVI cases. As a final application we demonstrate how the GLR can be used to check correctness of pedigree data, an essential quality control step in projects involving genotypes from related individuals. Unlike the other examples, this one operates over a continuous parameter space, enabling tools from classical statistics to guide decision-making.
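The GLR definition can be illustrated with a minimal sketch for the discrete case: each hypothesis is compatible with several candidate pedigrees, and the statistic compares the best-supported pedigree on each side. The pedigree labels and likelihood values below are hypothetical, and computing the pedigree likelihoods themselves is outside the scope of this example.

```python
def generalised_lr(likelihoods_first, likelihoods_second):
    """Ratio of the maximal likelihoods under two partially specified hypotheses.

    Each argument maps a candidate pedigree (compatible with that
    hypothesis) to the likelihood of the marker data given the pedigree.
    Returns the GLR and the maximising pedigree on each side.
    """
    best_first = max(likelihoods_first, key=likelihoods_first.get)
    best_second = max(likelihoods_second, key=likelihoods_second.get)
    glr = likelihoods_first[best_first] / likelihoods_second[best_second]
    return glr, best_first, best_second


# Hypothetical likelihoods. Under the first hypothesis A is the
# grandmother of B; under the second A and B are unrelated. The third
# individual C may or may not be a sibling of B in either case.
grandmother = {"C unrelated": 2e-30, "C sibling of B": 8e-29}
unrelated = {"C unrelated": 1e-31, "C sibling of B": 4e-31}

glr, best_first, best_second = generalised_lr(grandmother, unrelated)
print(glr, best_first, best_second)
```

No prior probabilities over the candidate pedigrees are needed, because only the maximising pedigree on each side enters the ratio.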
Evaluating the effect of marker panel sizes on estimation of bio-geographical co-ancestry proportions
A large number of ancestry-informative marker panels have been developed for forensic bio-geographical ancestry (BGA) analysis during the past decade, which offer valuable investigative tools for cold cases. The developed assays for capillary electrophoresis (CE) and massively parallel sequencing (MPS) focus on the differentiation of major populations, with MPS allowing larger numbers of markers to be multiplexed at the same time and therefore improved differentiation of more closely related Eurasian populations. One limitation of BGA inference tools is the handling of co-ancestry in individuals with admixed backgrounds, which leads to two situations being indistinguishable: (i) the individual belongs to an admixed population, or (ii) the individual has recent ancestors from different populations. Accurate and precise co-ancestry estimates can help, as first or second-degree admixture would show a ∼ 50-50 % or ∼ 75-25 % ratio of co-ancestry proportions. Here we compared the co-ancestry proportion estimates obtained for the set of 2504 individuals from the 1000 Genomes Project with dedicated BGA and human identification (ID) assays of different sizes against those obtained with the > 500,000 SNP Affymetrix Human Origins panel, used as the point of reference for each individual. The results of the correlation analysis performed with > 500 admixed individuals indicate that panel size plays a major role in the accuracy of the co-ancestry estimates. Therefore, the large-scale forensic MPS ID panels we evaluated constitute a valuable alternative to small- and medium-scale BGA panels, especially when admixture is expected.
Forensic insights from shotgun metagenomics: Tracing microbial exchange during sexual intercourse
The microbiome is an emerging field of interest within forensic science with high potential for individualization; however, little is known about bacterial species specific to the genital area or their ability to transfer between individuals during sexual contact. In this proof-of-concept study, we investigated microbial transfer dynamics in seven monogamous, heterosexual couples by collecting pre- and post-sexual intercourse samples from their genital areas, including penile, vaginal, and labial locations. Utilizing shotgun metagenomic sequencing, we sequenced the microbial profiles of these samples. Our findings reveal significant transfer from the vaginal microbiome onto the penile microbiome, predominantly originating from the labial genitalia. Moreover, strain analysis unveiled distinct differentiation between the same species of bacteria across individuals, underscoring the potential for microbial forensics to distinguish individuals. This study contributes to our understanding of microbial transfer during sexual contact and highlights the forensic implications of the genital microbiome.
Mitochondrial genome sequencing with ForenSeq™ mtDNA Whole Genome Kit
Mitochondrial DNA (mtDNA) possesses unique genetic characteristics and plays a crucial role in forensic DNA analysis. Based on the massively parallel sequencing (MPS) technology alongside the short overlapping amplicon method, the ForenSeq™ mtDNA Whole Genome Kit is specifically designed for mtDNA analysis. In this study, we employ the ForenSeq™ mtDNA Whole Genome Kit on the MiSeq FGx® Sequencing System for mitochondrial genome (mtGenome) sequencing across nine consecutive runs and assess its MPS performance, such as read depth (RD), forward/reverse strand bias (SB), and mtGenome coverage. Furthermore, we conduct internal validations to evaluate its routine application in forensic sciences, including sensitivity, repeatability, concordance, degraded samples, inhibitor samples, case-type samples, and contamination. As a result, the Real-Time Analysis (RTA) and Universal Analysis Software (UAS) demonstrate proficient run metrics and MPS performance when 12-14 libraries are sequenced within a standard flow cell, achieving > 80 % of reads passing filter, > 80 % bases with ≥Q30, > 5000 × of the average RD, ∼1.0 of the average SB, > 70 % of the inter-amplicon balance, and > 99 % of the mtGenome coverage. The five most vulnerable amplicons, exhibiting low RD and high SB, are identified as nucleotide positions (nps) 1094-1177, 5858-5975, 6109-6149, 6718-6810, and 7021-7090. For tertiary data analysis, the substitutions are accurately reported by UAS, while insertions and deletions (indels), point heteroplasmies (PHPs), and/or length heteroplasmies (LHPs) still necessitate manual inspection. On average, 40 variants were found in 60 samples, ranging from 27 to 54. A total of 2426 variants are observed at 491 nps. 
Moreover, the workflow can yield repeatable and reproducible results, generate complete mtGenome profiles from ≥ 2 pg input gDNA for high quality samples/control DNA or ≥ 0.5 cm hair shafts, and recover more/complete mtGenome information from severely degraded samples (degradation index >10) and various types of case samples. If two rounds of purification are conducted, it can more effectively remove additional reaction components and enhance data recovery from the mtGenome, especially for low-input samples. The negative controls in three runs contain some reads, but this contamination does not compromise the mitochondrial analyses. In conclusion, the ForenSeq™ mtDNA Whole Genome Kit, including 234 short overlapping amplicons with an average size of 131 bp, can meet forensic needs for whole mtGenome sequencing in real scenarios. In addition, the ten insights gained from this study may serve as a valuable reference for forensic scientists who are utilizing this kit.
A continuous model for interpreting microhaplotype profiles of forensic DNA mixtures
Microhaplotypes (MHs) have great potential in forensic DNA analysis, with applications in individual identification, kinship analysis and ancestry inference. Whatever the forensic application, DNA mixtures may be encountered. This study aims to develop and evaluate a continuous model for interpreting mixed genotype data from MH markers. We characterized MH profile features and modeled allele read counts using a truncated Gaussian distribution, accounting for allele dropout, noise, and locus-specific detection efficiency. The model was tested on 90 DNA mixtures generated from nine unrelated individuals across various mixture proportions. Likelihood ratio (LR) values were computed for both true contributors and non-contributors, and mixture deconvolution was performed. Results demonstrated high accuracy and specificity in interpreting MH profiles for 2- to 3-person DNA mixtures: true contributors obtained LR values greater than 1 in 190 out of 200 LR calculations. In 26,700 simulated non-contributor tests, the proportion of non-contributors with an LR greater than 1 was 0.0051 % for 2-person mixtures and 4.68 % for 3-person mixtures. Excluding balanced individuals in mixtures, the average deconvolution accuracy rate for major contributors was 0.9145, with 60.98 % (100/164) achieving an accuracy rate of 1. Additionally, we observed that distinguishing alleles from non-alleles became increasingly challenging with higher mixture proportions or additional contributors, with noise identified as a critical factor affecting genotyping accuracy.
Are microhaplotypes derived from the 1000 Genomes Project reliable for forensic purposes?
Microhaplotypes (MHs) have emerged as an important genetic marker in forensic genetics. However, most studies have overlooked the potential for phasing errors within microhaplotypes based on the 1000 Genomes Project (1kGP), which may impact the evaluation of various forensic parameters and lead to misleading results. In this study, we constructed a dense and extensive set of MHs across the human genome, using the expanded 1000 Genomes Project data aligned to the GRCh38 reference genome. We applied three different SNP minor allele frequency (MAF) thresholds (0, 0.01, and 0.05) to evaluate the reliability of these markers. Utilizing pedigree data from 18 populations, which included a total of 602 trios, we scanned for and confirmed suspected phasing error events at these MH loci. We also sequenced 50 MHs for one trio of the Southern Han Chinese (CHS) population to further investigate these discrepancies. The results revealed the presence of phasing errors in MHs from 1kGP when analyzed using targeted enrichment and next-generation sequencing. The probability of suspected phasing error events was strongly and positively correlated with the effective number of alleles (Ae) and informativeness (In) of the markers. These mismatch probabilities also varied significantly across different continental populations. When selecting loci, applying MAF filtering and avoiding regions such as the MHC can reduce the occurrence of such events to some extent. Based on these findings, we suggest that relying solely on sequencing data of the 1kGP for forensic purposes may be risky. A thorough investigation of the true forensic parameters of MHs is essential to ensure their reliability in forensic applications.
Unprecedented male relative differentiation with Y-SNVs from whole genome sequencing
The principal limitation of forensic Y-STR analysis, which identifies a male lineage rather than an individual man, is being addressed by the discovery and application of rapidly mutating Y-STRs (RM Y-STRs). Due to their higher mutation rates compared to standard Y-STRs used in forensics, RM Y-STRs significantly enhance the ability to differentiate between male relatives. However, some male relatives, particularly closely related ones, remain indistinguishable. Given the design and execution of the two previous RM Y-STR searches that discovered the 26 currently known RM Y-STRs, it is unlikely that future searches will substantially increase the number of RM Y-STRs. To address the ongoing forensic challenge of differentiating between male relatives using Y chromosome analysis, this study explores an alternative approach: Y-chromosomal single nucleotide variants (Y-SNVs) obtained via whole genome sequencing (WGS). To assess the feasibility of the WGS technology in differentiating closely and distantly related males, we sequenced DNA samples of 24 male individuals belonging to three deep-rooted pedigrees, covering 12 father-son pairs and 72 pairs of distant male relatives separated by 8-15 meioses. Among the 76 meioses analyzed in total, 90 male relative-differentiating Y-SNVs were identified across the approximately 25 Mbp Y chromosome sequence generated per sample. A total of 141 male relative-differentiating Y chromosome mutations were observed when also considering Y-STRs from Yfiler Plus, RMplex, and WGS analyses. Of the 12 father-son pairs, six (50 %) were differentiated by one or more Y-SNVs, and nine (75 %) with WGS and CE methods combined. All of the 72 pairs of distant male relatives were distinguished both through Y-SNVs and RM Y-STRs. Overall, when compared to RMplex, WGS yielded a 1.7-fold increase in the number of observed mutations in father-son pairs and a 4-fold increase in distantly related males.
Our proof-of-principle study demonstrates (i) the feasibility and high value of Y-SNV markers and WGS technology in differentiating both close and distant male relatives; (ii) the superior performance of Y-SNVs from WGS relative to the previously used RM Y-STR markers and RMplex method; and (iii) the enhanced male relative differentiation achieved by combining both marker types and methods. We envision WGS as the method of choice for maximizing male relative differentiation based on Y chromosome information in high-profile criminal cases with male suspects where no autosomal STR profiles are available and where standard Y-STR and RM Y-STR analyses fail to distinguish the suspect from his male paternal relatives.
Remains of the German outlaw Johannes Bückler alias Schinderhannes identified by an interdisciplinary approach
Two mounted skeletons assigned to the famous German criminals Schinderhannes and Hölzerlips were on display at the Anatomical Collection of Heidelberg University for two centuries. However, doubts about their authenticity existed for decades. Based on historical research, an interdisciplinary team with experts from the fields of anatomy, radiology, anthropology, genealogy and molecular biology set out to examine the remains from the following perspectives: (1) isotope analyses were carried out to compare inferred childhood residences with historical narratives, (2) anthropological and radiological examinations were documented and compared with historical records, (3) genealogical research identified a living male descendant along the maternal line and (4) mitogenome sequencing as well as nuclear SNP analysis using the FORCE panel provided compelling evidence for the identification of Schinderhannes' remains. Additionally, the prediction of eye, hair and skin color from the DNA offered science-based data to clarify conflicting historical records.
Forensic Science International: Genetics: Past, present, and future of the journal and the field