How can concepts from ecology enable insights about cellular communities?
Systematic analysis identifies a connection between spatial and genomic variations of chromatin states
Chromatin states play important roles in the maintenance of cell identities, yet their spatial patterns remain poorly characterized at the organism scale. We developed a systematic approach to analyzing spatial epigenomic data and then applied it to a recently published spatial-CUT&Tag dataset that was obtained from a mouse embryo. We identified a set of spatial genes whose H3K4me3 patterns delineate tissue boundaries. These genes are enriched with tissue-specific transcription factors, and their corresponding genomic loci are marked by broad H3K4me3 domains. Integrative analysis with H3K27me3 profiles showed coordinated spatial transitions across tissue boundaries, which are marked by the continuous shortening of H3K4me3 domains and expansion of H3K27me3 domains. Motif-based analysis identified transcription factors whose activities change significantly during such transitions. Taken together, our systematic analyses reveal a strong connection between the genomic and spatial variations of chromatin states. A record of this paper's transparent peer review process is included in the supplemental information.
AlphaFold2 enables accurate deorphanization of ligands to single-pass receptors
Secreted proteins play crucial roles in paracrine and endocrine signaling; however, identifying ligand-receptor interactions remains challenging. Here, we benchmarked AlphaFold2 (AF2) as a screening approach to identify extracellular ligands to single-pass transmembrane receptors. Key to the approach is the optimization of AF2 input and output for screening ligands against receptors to predict the most probable ligand-receptor interactions. The predictions were performed on ligand-receptor pairs not used for AF2 training. We demonstrate high discriminatory power and a success rate of close to 90% for known ligand-receptor pairs and 50% for a diverse set of experimentally validated interactions. Further, we show that screen accuracy does not correlate linearly with prediction of ligand-receptor interaction. These results demonstrate a proof of concept of a rapid and accurate screening platform to predict high-confidence cell-surface receptors for a diverse set of ligands by structural binding prediction, with potentially wide applicability for the understanding of cell-cell communication.
Systematic screens for fertility genes essential for malaria parasite transmission reveal conserved aspects of sex in a divergent eukaryote
Sexual reproduction in malaria parasites is essential for their transmission to mosquitoes and offers a divergent eukaryote model to understand the evolution of sex. Through a panel of genetic screens in Plasmodium berghei, we identify 348 sex and transmission-related genes and define roles for unstudied genes as putative targets for transmission-blocking interventions. The functional data provide a deeper understanding of female metabolic reprogramming, meiosis, and the axoneme. We identify a complex of a SUN domain protein (SUN1) and a putative allantoicase (ALLC1) that is essential for male fertility by linking the microtubule organizing center to the nuclear envelope and enabling mitotic spindle formation during male gametogenesis. Both proteins have orthologs in mouse testis, and the data raise the possibility of an ancient role for atypical SUN domain proteins in coupling the nucleus and axoneme. Altogether, our data provide an unbiased picture of the molecular processes that underpin malaria parasite transmission. A record of this paper's transparent peer review process is included in the supplemental information.
Spatiotemporal dynamics during niche remodeling by super-colonizing microbiota in the mammalian gut
While fecal microbiota transplantation (FMT) has been shown to be effective in reversing gut dysbiosis, we lack an understanding of the fundamental processes underlying microbial engraftment in the mammalian gut. Here, we explored a murine gut colonization model leveraging natural inter-individual variations in gut microbiomes to elucidate the spatiotemporal dynamics of FMT. We identified a natural "super-donor" consortium that robustly engrafts into diverse recipients and resists reciprocal colonization. Temporal profiling of the gut microbiome showed an ordered succession of rapid engraftment by early colonizers within 72 h, followed by a slower emergence of late colonizers over 15-30 days. Moreover, engraftment was localized to distinct compartments of the gastrointestinal tract in a species-specific manner. Spatial metagenomic characterization suggested engraftment was mediated by simultaneous transfer of spatially co-localizing species from the super-donor consortia. These results offer a mechanism of super-donor colonization by which nutritional niches are expanded in a spatiotemporally dependent manner. A record of this paper's transparent peer review process is included in the supplemental information.
The master regulator OxyR orchestrates bacterial oxidative stress response genes in space and time
Bacteria employ diverse gene regulatory networks to survive stress, but deciphering the underlying logic of these complex networks has proved challenging. Here, we use time-resolved single-cell imaging to explore the functioning of the E. coli regulatory response to oxidative stress. We observe diverse gene expression dynamics within the network. However, by controlling for stress-induced growth-rate changes, we show that these patterns involve just three classes of regulation: downregulated genes, upregulated pulsatile genes, and gradually upregulated genes. The two upregulated classes are distinguished by differences in the binding of the transcription factor, OxyR, and appear to play distinct roles during stress protection. Pulsatile genes activate transiently in a few cells for initial protection of a group of cells, whereas gradually upregulated genes induce evenly, generating a lasting protection involving many cells. Our study shows how bacterial populations use simple regulatory principles to coordinate stress responses in space and time. A record of this paper's transparent peer review process is included in the supplemental information.
Tumor architecture and emergence of strong genetic alterations are bottlenecks for clonal evolution in primary prostate cancer
Prostate cancer (PCA) exhibits high levels of intratumoral heterogeneity. In this study, we developed a mathematical model to study the growth and genetic evolution of PCA. We explored the possible evolutionary patterns and demonstrated that tumor architecture represents a major bottleneck for divergent clonal evolution. Early consecutive acquisition of strong genetic alterations serves as a proxy for the formation of aggressive tumors. A limited number of clonal hierarchy patterns were identified. A biopsy study of synthetic tumors shows complex spatial intermixing of clones and delineates the importance of biopsy extent. Deep whole-exome multiregional next-generation DNA sequencing of the primary tumors from five patients was performed to validate the results, supporting our main findings from mathematical modeling. In conclusion, our model provides qualitatively realistic predictions of PCA genomic evolution, closely aligned with the evidence available from patient samples. We share the code of the model for further studies. A record of this paper's transparent peer review process is included in the supplemental information.
AlphaFold opens the doors to deorphanizing secreted proteins
Danneskiold-Samsøe and coworkers have developed an in silico screening pipeline based on AlphaFold2 for identifying single-pass transmembrane receptors for secreted peptides that play important roles in cell-cell signaling. Their approach can be used to deorphanize a diverse range of ligands. The overall strategy can be valuable in screening for weak and transient interactions.
Evaluation of Choudhary et al.: Single-cell gene expression dynamics in the E. coli oxidative stress response network
One snapshot of the peer review process for "The master regulator OxyR orchestrates bacterial oxidative stress response genes in space and time" (Choudhary et al., 2024).
Plausible, robust biological oscillations through allelic buffering
Biological oscillators can specify time- and dose-dependent functions via dedicated control of their oscillatory dynamics. However, how biological oscillators, which recurrently activate noisy biochemical processes, achieve robust oscillations remains unclear. Here, we characterize the long-term oscillations of p53 and its negative feedback regulator Mdm2 in single cells after DNA damage. Whereas p53 oscillates regularly, Mdm2 from a single MDM2 allele exhibits random unresponsiveness to ∼9% of p53 pulses. Using allelic-specific imaging of MDM2 activity, we show that MDM2 alleles buffer each other to maintain p53 pulse amplitude. Removal of MDM2 allelic buffering cripples the robustness of p53 amplitude, thereby elevating p21 levels and cell-cycle arrest. In silico simulations support that allelic buffering enhances the robustness of biological oscillators and broadens their plausible biochemical space. Our findings show how allelic buffering ensures robust p53 oscillations, highlighting the potential importance of allelic buffering for the emergence of robust biological oscillators during evolution. A record of this paper's transparent peer review process is included in the supplemental information.
How do you anticipate computational protein design will change biotechnology and therapeutic development?
Protein turnover regulation is critical for influenza A virus infection
The abundance of a protein is defined by its continuous synthesis and degradation, a process known as protein turnover. Here, we systematically profiled the turnover of proteins in influenza A virus (IAV)-infected cells using a pulse-chase stable isotope labeling by amino acids in cell culture (SILAC)-based approach combined with downstream statistical modeling. We identified 1,798 virus-affected proteins with turnover changes (tVAPs) out of 7,739 detected proteins (data available at pulsechase.innatelab.org). In particular, the affected proteins were involved in RNA transcription, splicing and nuclear transport, protein translation and stability, and energy metabolism. Many tVAPs appeared to be known IAV-interacting proteins that regulate virus propagation, such as KPNA6, PPP6C, and POLR2A. Notably, our analysis identified additional IAV host and restriction factors, such as the splicing factor GPKOW, that exhibit significant turnover rate changes while their total abundance is minimally affected. Overall, we show that protein turnover is a critical factor both for virus replication and antiviral defense.
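The relationship between synthesis, degradation, and abundance stated above can be illustrated with a minimal sketch. It assumes simple first-order kinetics, which is a textbook simplification and not the statistical model used in the study; the names `k_syn`, `k_deg`, and all functions below are hypothetical and chosen only for illustration. The sketch shows how a degradation rate could be estimated from the decay of a pulse-labeled fraction, and how two proteins can have the same steady-state abundance while turning over at different rates.

```python
import math

def labeled_fraction(t_hours, k_deg):
    """Fraction of pulse-labeled protein remaining after a chase of t_hours,
    assuming first-order degradation (an illustrative assumption)."""
    return math.exp(-k_deg * t_hours)

def fit_k_deg(times, fractions):
    """Estimate k_deg as the negated least-squares slope (through the origin)
    of ln(fraction) versus chase time."""
    num = sum(t * math.log(f) for t, f in zip(times, fractions))
    den = sum(t * t for t in times)
    return -num / den

def half_life(k_deg):
    """Half-life implied by a first-order degradation rate."""
    return math.log(2) / k_deg

def steady_state_abundance(k_syn, k_deg):
    """Steady state of dP/dt = k_syn - k_deg * P, i.e. P = k_syn / k_deg."""
    return k_syn / k_deg

# Hypothetical chase time points (hours) and a noise-free synthetic decay curve.
times = [0, 2, 4, 8]
fractions = [labeled_fraction(t, 0.1) for t in times]
k_est = fit_k_deg(times, fractions)

# Two hypothetical proteins: identical abundance, tenfold different turnover.
fast = steady_state_abundance(k_syn=10.0, k_deg=0.1)
slow = steady_state_abundance(k_syn=1.0, k_deg=0.01)
```

Under these assumptions, `fast` and `slow` are equal even though the half-lives differ tenfold, which is the kind of change that an abundance measurement alone would not detect.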
Exploring "dark-matter" protein folds using deep learning
De novo protein design explores uncharted sequence and structure space to generate novel proteins not sampled by evolution. A main challenge in de novo design involves crafting "designable" structural templates to guide the sequence searches toward adopting target structures. We present a convolutional variational autoencoder that learns patterns of protein structure, dubbed Genesis. We coupled Genesis with trRosetta to design sequences for a set of protein folds and found that Genesis is capable of reconstructing native-like distance and angle distributions for five native folds and three novel, so-called "dark-matter" folds, as a demonstration of generalizability. We used a high-throughput assay to characterize the stability of the designs through protease resistance, obtaining encouraging success rates for folded proteins. Genesis enables exploration of the protein fold space within minutes, unrestricted by protein topologies. Our approach addresses the backbone designability problem, showing that small neural networks can efficiently learn structural patterns in proteins. A record of this paper's transparent peer review process is included in the supplemental information.
SpotGF: Denoising spatially resolved transcriptomics data using an optimal transport-based gene filtering algorithm
Spatially resolved transcriptomics (SRT) combines gene expression profiles with the physical locations of cells in their native states but suffers from unpredictable spatial noise due to cell damage during cryosectioning and exposure to reagents for staining and mRNA release. To address this noise, we developed SpotGF, an algorithm for denoising SRT data using optimal transport-based gene filtering. SpotGF quantifies diffusion patterns numerically, distinguishing widespread expression genes from aggregated expression genes and filtering out the former as noise. Unlike conventional denoising methods, SpotGF preserves raw sequencing data, thereby avoiding false positives that can arise from imputation. Additionally, SpotGF demonstrates superior performance in cell clustering, identifying potential marker genes, and annotating cell types. Overall, SpotGF has the potential to become a crucial preprocessing step in the downstream analysis of SRT data. The SpotGF software is freely available on GitHub. A record of this paper's transparent peer review process is included in the supplemental information.
A digital CRISPR-dCas9-based gene remodeling biocomputer programmed by dietary compounds in mammals
CRISPR-dCas9 (dead Cas9 protein) technology, combined with chemical molecules and light-triggered genetic switches, offers customizable control over gene perturbation. However, these simple ON/OFF switches cannot precisely determine the sophisticated perturbation process. Here, we developed a resveratrol and protocatechuic acid-programmed CRISPR-mediated gene remodeling biocomputer (REPA) for conditional endogenous transcriptional regulation of genes in vitro and in vivo. Two REPA variants, REPA and REPA, were designed for the logic control of gene inhibition and activation, respectively. We successfully demonstrated the digital computations of single or multiplexed endogenous gene transcription by using REPA. We also established mathematical models to predict the dose-responsive transcriptional levels of a target endogenous gene controlled by REPA. Moreover, high levels of endogenous gene activation in mice mediated by the AND logic gate demonstrated computational control of CRISPR-dCas9-based epigenome remodeling in mice. This CRISPR-based biocomputer expands the synthetic biology toolbox and can potentially advance gene-based precision medicine. A record of this paper's transparent peer review process is included in the supplemental information.
Putting proteins in context
Proteins exhibit cell-type-specific functions and interactions, yet most ways of representing proteins lack any biological or environmental context. To address this gap, recent work by Li et al. introduces PINNACLE, a geometric deep learning approach that generates contextualized representations of proteins by combined analysis of protein interactions and multiorgan single-cell transcriptomics.
Data-driven batch detection enhances single-cell omics data analysis
In single-cell omics studies, data are typically collected across multiple batches, resulting in batch effects: technical confounders that introduce noise and distort data distribution. Correcting these effects is challenging due to their unknown sources, nonlinear distortions, and the difficulty of accurately assigning data to batches that are optimal for integration methods.
Transcriptional memory formation: Battles between transcription factors and repressive chromatin
Transcriptional memory allows cells to respond to previously experienced signals in a faster, stronger, and more sensitive manner. Using synthetic biology approaches, Fan and colleagues uncovered the critical interplays between transcription factors and repressive chromatin in consolidating transcriptional memory.
Evolution in microbial microcosms is highly parallel, regardless of the presence of interacting species
Evolution often follows similar trajectories in replicate populations, suggesting that it may be predictable. However, populations are naturally embedded in multispecies communities, and the extent to which evolution is contingent on the specific species interacting with the focal population is still largely unexplored. Here, we study adaptations in strains of 11 different species, experimentally evolved both in isolation and in various pairwise co-cultures. Although partner-specific effects are detectable, evolution was mostly shared between strains evolved with different partners; similar changes occurred in strains' growth abilities, in community properties, and in about half of the repeatedly mutated genes. This pattern persisted even in species pre-adapted to the abiotic conditions. These findings indicate that evolution may not always depend strongly on the biotic environment, making predictions regarding coevolutionary dynamics less challenging than previously thought. A record of this paper's transparent peer review process is included in the supplemental information.
Markov field network model of multi-modal data predicts effects of immune system perturbations on intravenous BCG vaccination in macaques
Analysis of multi-modal datasets can identify multi-scale interactions underlying biological systems but can be beset by spurious connections due to indirect impacts propagating through an unmapped biological network. For example, studies in macaques have shown that Bacillus Calmette-Guerin (BCG) vaccination by an intravenous route protects against tuberculosis, correlating with changes across various immune data modes. To eliminate spurious correlations and identify critical immune interactions in a public multi-modal dataset (systems serology, cytokines, and cytometry) of vaccinated macaques, we applied Markov fields (MFs), a data-driven approach that explains vaccine efficacy and immune correlations via multivariate network paths, without requiring large numbers of samples (i.e., macaques) relative to multivariate features. We find that integrating multiple data modes with MFs helps remove spurious connections. Finally, we used the MF to predict outcomes of perturbations at various immune nodes, including an experimentally validated B cell depletion that induced network-wide shifts without reducing vaccine protection.