A computational model of ESAT-6 complex in membrane
One quarter of the world's population is infected by Mycobacterium tuberculosis (Mtb), a leading cause of death among bacterial pathogens. Recent evidence has demonstrated that two virulence factors, ESAT-6 and CFP-10, play crucial roles in Mtb's cytosolic translocation. Many efforts have been made to study the ESAT-6 and CFP-10 proteins, and some studies have shown that ESAT-6 plays an essential role in rupturing the phagosome. However, the mechanism by which ESAT-6 interacts with the membrane is not yet fully understood. Recent studies indicate that ESAT-6 dissociates from CFP-10 upon their interaction with the phagosome membrane and forms a membrane-spanning pore. Based on these observations, as well as the available structure of ESAT-6, ESAT-6 is hypothesized to form an oligomer for membrane insertion and rupture. Such an ESAT-6 oligomer may play a significant role in tuberculosis infection. Therefore, a deeper understanding of ESAT-6 oligomerization will establish new directions for tuberculosis treatment. However, the structure of the ESAT-6 oligomer is not known. Here, we propose a comprehensive approach to model the complex structures of the ESAT-6 oligomer inside a membrane. Several computational tools, including MD simulation, symmetrical docking, and MM/PBSA, are used to obtain and characterize such a complex structure. Results from our studies lead to a well-supported hypothesis of ESAT-6 oligomerization and to the identification of residues essential for stabilizing the ESAT-6 oligomer, which provide useful insights for future drug design targeting tuberculosis. The approach in this research can also be used to model and study other cross-membrane complex structures.
Predicting mucin-type O-Glycosylation using enhancement value products from derived protein features
Mucin-type O-glycosylation is one of the most common post-translational modifications of proteins. This glycosylation is initiated in the Golgi by the addition of the sugar N-acetylgalactosamine (GalNAc) onto protein Ser and Thr residues by a family of polypeptide GalNAc transferases. In humans there are 20 isoforms, differentially expressed across tissues, that serve multiple important biological roles. Using random peptide substrates, isoform-specific amino acid preferences have been obtained in the form of enhancement values (EVs). These EVs alone have previously been used to predict O-glycosylation sites via the web-based ISOGlyP (Isoform Specific O-Glycosylation Prediction) tool. Here we explore additional protein features to determine whether they can complement the random-peptide-derived enhancement values and increase the predictive power of ISOGlyP. The inclusion of additional protein substrate features (such as secondary structure and surface accessibility) was found to increase sensitivity with minimal loss of specificity when tested with three different published O-glycoproteomics data sets, thus increasing the overall accuracy of the ISOGlyP predictions.
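The idea of scoring a candidate site by a product of enhancement values can be sketched as follows. This is only an illustration under stated assumptions: the EV table, the flanking offsets, and the function name `ev_product` are hypothetical and are not taken from ISOGlyP itself.

```python
from math import prod

# Hypothetical enhancement values (EVs) for a single isoform, keyed by
# (offset from the candidate Ser/Thr, amino acid). The numbers are made up
# for illustration: EV > 1 favors glycosylation, EV < 1 disfavors it.
HYPOTHETICAL_EV = {
    (-1, "P"): 1.8,
    (1, "P"): 2.1,
    (3, "P"): 2.5,
    (-1, "D"): 0.6,
}


def ev_product(sequence, site, ev_table, offsets=(-3, -2, -1, 1, 2, 3)):
    """Multiply the EVs of the residues flanking `site` (0-based index).

    Flanking positions that fall outside the sequence are skipped, and
    residues missing from the table contribute a neutral factor of 1.0.
    The choice of offsets is an assumption of this sketch.
    """
    if sequence[site] not in "ST":
        raise ValueError("site must be a Ser or Thr residue")
    factors = []
    for off in offsets:
        pos = site + off
        if 0 <= pos < len(sequence):
            factors.append(ev_table.get((off, sequence[pos]), 1.0))
    return prod(factors)


# Score the central Thr of a short, invented peptide.
score = ev_product("GAPTPAP", 3, HYPOTHETICAL_EV)
```

In such a scheme, additional features like secondary structure or surface accessibility could enter as further multiplicative or learned terms; how they are actually combined is not specified in the abstract.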
Accuracy of continuum electrostatic calculations based on three common dielectric boundary definitions
We investigate the influence of three common definitions of the solute/solvent dielectric boundary (DB) on the accuracy of the electrostatic solvation energy ΔG_el computed within the Poisson-Boltzmann and the generalized Born models of implicit solvation. The test structures include small molecules, peptides and small proteins; explicit solvent ΔG_el values are used as the accuracy reference. For the common atomic radii sets BONDI and PARSE (and ZAP9 for small molecules), the use of the van der Waals (vdW) DB results, on average, in considerably larger errors in ΔG_el than the molecular surface (MS) DB. The optimal probe radius for which the MS DB yields the most accurate ΔG_el varies considerably between structure types. The solvent accessible surface (SAS) DB becomes optimal at a probe radius of ~0.2 Å (the exact value is sensitive to the structure and atomic radii), at which point the average accuracy of ΔG_el is comparable to that of the MS-based boundary. The geometric equivalence of the SAS to the vdW surface based on the same atomic radii uniformly increased by the probe radius gives the corresponding optimal vdW DB. For small molecules, the optimal vdW DB based on BONDI + 0.2 Å radii can yield ΔG_el estimates at least as accurate as those based on the optimal MS DB. Also, in small molecules, pairwise charge-charge interactions computed with the optimal vdW DB are virtually equal to those computed with the MS DB, suggesting that in this case the two boundaries are practically equivalent by the electrostatic energy criteria. In structures other than small molecules, the optimal vdW and MS dielectric boundaries are not equivalent: the respective pairwise electrostatic interactions in the presence of solvent can differ by up to 5 kcal/mol for individual atomic pairs in small proteins, even when the total ΔG_el values are equal.
For small proteins, the average decrease in pairwise electrostatic interactions resulting from the switch from the optimal MS to the optimal vdW DB definition can be mimicked within the MS DB definition by doubling the solute dielectric constant. However, the use of the higher interior dielectric does not eliminate the large individual deviations between pairwise interactions computed within the two DB definitions. It is argued that while the MS-based definition of the dielectric boundary is more physically correct in some types of practical calculations, the choice is not so clear in some other common scenarios.
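The geometric relation between the SAS and vdW boundaries can be sketched in a few lines: a point lies inside the SAS built with a given probe radius exactly when it lies inside the union of atomic spheres whose radii are enlarged by that probe radius. The function name `inside_vdw` and the single-atom example are illustrative assumptions; the radii are the standard Bondi values for the listed elements.

```python
import math

# Bondi van der Waals radii in angstroms for a few common elements.
BONDI = {"H": 1.20, "C": 1.70, "N": 1.55, "O": 1.52}


def inside_vdw(point, atoms, radii, increase=0.0):
    """Return True if `point` lies inside the union of atomic spheres.

    atoms    -- list of (element, (x, y, z)) tuples, coordinates in angstroms
    radii    -- mapping from element symbol to atomic radius
    increase -- uniform increment added to every radius; setting it to a
                probe radius gives the interior of the corresponding SAS
    """
    return any(
        math.dist(point, center) <= radii[element] + increase
        for element, center in atoms
    )


# A single carbon atom at the origin and a test point 1.8 angstroms away.
atoms = [("C", (0.0, 0.0, 0.0))]
test_point = (1.8, 0.0, 0.0)

in_plain_vdw = inside_vdw(test_point, atoms, BONDI)            # radius 1.70
in_augmented = inside_vdw(test_point, atoms, BONDI, 0.2)       # radius 1.90
```

Here the test point is outside the plain Bondi sphere but inside the sphere enlarged by 0.2 Å, which is the kind of boundary shift the comparison above is concerned with.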
On the Modeling of Polar Component of Solvation Energy using Smooth Gaussian-Based Dielectric Function
Traditional implicit methods for modeling electrostatics in biomolecules use a two-dielectric approach: the biomolecule is assigned a low dielectric constant while the water phase is treated as a high-dielectric medium. However, such an approach treats the biomolecule-water interface as a sharp dielectric border between two homogeneous dielectric media and does not account for the inhomogeneous dielectric properties of the macromolecule. Recently we reported a new development, a smooth Gaussian-based dielectric function that treats the entire system, the solute and the water phase, as an inhomogeneous dielectric medium (J Chem Theory Comput. 2013 Apr 9; 9(4): 2126-2136.). Here we examine various aspects of the modeling of polar solvation energy in such inhomogeneous systems in terms of the solute-water boundary and the inhomogeneity of the solute in the absence of surrounding water. The smooth Gaussian-based dielectric function is implemented in the DelPhi finite-difference program; therefore, the sensitivity of the results to the grid parameters is investigated, and it is shown that the calculated polar solvation energy is almost grid independent. Furthermore, the results are compared with the standard two-media model, and it is demonstrated that, on average, the standard method overestimates the magnitude of the polar solvation energy by a factor of 2.5. Lastly, the possibility that the solute has a local dielectric constant larger than that of bulk water is investigated in a benchmarking test against an experimentally determined set of pKa values, and it is speculated that side chain rearrangements could result in a local dielectric constant larger than 80.
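A minimal sketch of how a smooth, Gaussian-based dielectric function can be constructed is given below. It assumes one common form: each atom contributes a Gaussian density, the densities are combined into a total solute density between 0 and 1, and the dielectric value is interpolated linearly between an interior and an exterior reference. The parameter values (`sigma`, `eps_in`, `eps_out`) and function names are illustrative assumptions, not the DelPhi implementation.

```python
import math


def gaussian_density(point, atoms, sigma=0.93):
    """Smooth solute density in [0, 1] built from atom-centered Gaussians.

    atoms -- list of ((x, y, z), radius) tuples, in angstroms
    sigma -- dimensionless width parameter (assumed value)
    """
    outside = 1.0
    for center, radius in atoms:
        r2 = sum((p - c) ** 2 for p, c in zip(point, center))
        rho_i = math.exp(-r2 / (sigma ** 2 * radius ** 2))
        # Probability-like combination: product of "not inside atom i".
        outside *= 1.0 - rho_i
    return 1.0 - outside


def dielectric(point, atoms, eps_in=2.0, eps_out=80.0, sigma=0.93):
    """Interpolate between eps_in (dense solute) and eps_out (bulk water)."""
    rho = gaussian_density(point, atoms, sigma)
    return rho * eps_in + (1.0 - rho) * eps_out


# One atom of radius 1.7 angstroms at the origin; sample along the x axis.
atoms = [((0.0, 0.0, 0.0), 1.7)]
profile = [dielectric((x, 0.0, 0.0), atoms) for x in (0.0, 1.0, 2.0, 4.0)]
```

With this construction the dielectric value rises smoothly from `eps_in` at an atom center toward `eps_out` far from the solute, so there is no sharp boundary whose placement must be chosen.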
Distinct mechanisms of a phosphotyrosyl peptide binding to two SH2 domains
Protein phosphorylation is a very common post-translational modification, catalyzed by kinases, for signaling and regulation. Phosphotyrosines frequently target SH2 domains. The spleen tyrosine kinase (Syk) is critical for tyrosine phosphorylation of multiple proteins and for regulation of important pathways. Phosphorylation of both Y342 and Y346 in Syk linker B is required for optimal signaling. The SH2 domains of Vav1 and PLC-γ both bind this doubly phosphorylated motif. Here we used a recently developed method to calculate the effects of Y342 and Y346 phosphorylation on the rate constants of a peptide from Syk linker B binding to the SH2 domains of Vav1 and PLC-γ. The predicted effects agree well with experimental observations. Moreover, we found that the same doubly phosphorylated peptide binds the two SH2 domains via distinct mechanisms, with apparent rigid docking for Vav1 SH2 and dock-and-coalesce for PLC-γ SH2.
Multiscale Multiphysics and Multidomain Models I: Basic Theory
This work extends our earlier two-domain formulation of a differential geometry based multiscale paradigm into a multidomain theory, which gives us the ability to simultaneously accommodate multiphysical descriptions of aqueous chemical, physical and biological systems, such as fuel cells, solar cells, nanofluidics, ion channels, viruses, RNA polymerases, molecular motors and large macromolecular complexes. The essential idea is to make use of the differential geometry theory of surfaces as a natural means to geometrically separate the macroscopic domain of solvent from the microscopic domain of solute, and to dynamically couple continuum and discrete descriptions. Our main strategy is to construct energy functionals that put multiphysics descriptions on an equal footing, including polar (i.e., electrostatic) solvation, nonpolar solvation, chemical potential, quantum mechanics, fluid mechanics, molecular mechanics, coarse grained dynamics and elastic dynamics. The variational principle is applied to the energy functionals to derive desirable governing equations, such as multidomain Laplace-Beltrami (LB) equations for macromolecular morphologies, a multidomain Poisson-Boltzmann (PB) equation or Poisson equation for the electrostatic potential, generalized Nernst-Planck (NP) equations for the dynamics of charged solvent species, a generalized Navier-Stokes (NS) equation for fluid dynamics, generalized Newton's equations for molecular dynamics (MD) or coarse-grained dynamics, and an equation of motion for elastic dynamics. Unlike the classical PB equation, our PB equation is an integral-differential equation due to solvent-solute interactions. To illustrate the proposed formalism, we have explicitly constructed three models: a multidomain solvation model, a multidomain charge transport model and a multidomain chemo-electro-fluid-MD-elastic model. Each solute domain is equipped with a distinct surface tension, pressure, dielectric function, and charge density distribution.
In addition to long-range Coulombic interactions, various non-electrostatic solvent-solute interactions are considered in the present modeling. We demonstrate the consistency between the non-equilibrium charge transport model and the equilibrium solvation model by showing the systematic reduction of the former to the latter at equilibrium. This paper also offers a brief review of the field.
Optimization Bias in Energy-Based Structure Prediction
Physics-based computational approaches to predicting the structure of macromolecules such as proteins are gaining increased use, but there are remaining challenges. In the current work, it is demonstrated that in energy-based prediction methods, the degree of optimization of the sampled structures can influence the prediction results. In particular, discrepancies in the degree of local sampling can bias the predictions in favor of the oversampled structures by shifting the local probability distributions of the minimum sampled energies. In simple systems, it is shown that the magnitude of the errors can be calculated from the energy surface, and for certain model systems, derived analytically. Further, it is shown that for energy wells whose forms differ only by a randomly assigned energy shift, the optimal accuracy of prediction is achieved when the sampling around each structure is equal. Energy correction terms can be used in cases of unequal sampling to reproduce the total probabilities that would occur under equal sampling, but optimal corrections only partially restore the prediction accuracy lost to unequal sampling. For multiwell systems, the determination of the correction terms is a multibody problem; it is shown that the involved cross-correlation multiple integrals can be reduced to simpler integrals. The possible implications of the current analysis for macromolecular structure prediction are discussed.
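The bias from unequal sampling can be illustrated with a toy simulation. In this sketch, two candidate structures have identical energy distributions (a standard normal, chosen purely for illustration), and the "prediction" is whichever structure yields the lowest sampled energy. The function name `win_fraction` and all numerical settings are assumptions of the example, not part of the analysis described above.

```python
import random


def win_fraction(n_a, n_b, trials=2000, seed=0):
    """Fraction of trials in which structure A has the lowest sampled energy.

    Both structures draw energies from the same distribution, so any
    deviation from 0.5 is caused purely by the sampling imbalance.
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        min_a = min(rng.gauss(0.0, 1.0) for _ in range(n_a))
        min_b = min(rng.gauss(0.0, 1.0) for _ in range(n_b))
        if min_a < min_b:
            wins += 1
    return wins / trials


equal = win_fraction(10, 10)      # balanced sampling
skewed = win_fraction(100, 10)    # structure A sampled ten times more
```

For identically distributed energies, the chance that the global minimum falls among A's samples is n_a / (n_a + n_b), so balanced sampling gives about one half while the skewed case favors the oversampled structure roughly ten to one, even though neither structure is intrinsically better.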
Computational Characterization of Mutations in Cardiac Troponin T Known to Cause Familial Hypertrophic Cardiomyopathy
Cardiac Troponin T (cTnT) is a central modulator of thin filament regulation of myofilament activation. The lack of structural data for the TNT1 tail domain, a proposed α-helical region, makes the functional implications of familial hypertrophic cardiomyopathy (FHC) mutations difficult to determine. Studies have suggested that flexibility of TNT1 is important in normal protein-protein interactions within the thin filament. Our groups have previously shown through Molecular Dynamics (MD) simulations that some FHC mutations, Arg92Leu (R92L) and Arg92Trp (R92W), result in increased flexibility at a critical hinge region 12 residues distant from the mutation. To explain this distant effect and its implications for FHC mutations, we characterized the dynamics of wild type and mutant segments of cTnT using MD. Our data show an opening of the helix between residues 105-110 in mutants. Consequently, the dihedral angles of these residues correspond to non-α-helical regions on Ramachandran plots. We hypothesize that the removal of a charged residue decreases electrostatic repulsion between the point mutation and surrounding residues, resulting in local helical compaction. Constrained ends of the helix and localized compaction result in expansion within the nearest non-polar helical turn from the mutation site, residues 105-109.
On the electrostatic properties of homodimeric proteins
A large fraction of proteins function as homodimers, but it is not always clear why dimerization is important for functionality, since frequently each monomer possesses a distinct active site. Recent work (PLoS Computational Biology, 9(2), e1002924) indicates that homodimerization may be important for forming an electrostatic funnel in the spermine synthase homodimer which guides charged substrates toward the active centers. This prompted us to investigate the electrostatic properties of a large set of homodimeric proteins and resulted in the observation that in a vast majority of cases dimerization indeed results in specific electrostatic features, although not necessarily in an electrostatic funnel. It is demonstrated that the electrostatic dipole moment of the dimer is predominantly perpendicular to the axis connecting the centers of mass of the monomers. In addition, the surface points with the highest potential are located in the proximity of the interfacial plane of the homodimeric complexes. These findings indicate that homodimerization frequently provides specific electrostatic features needed for the function of proteins.
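The geometric quantity discussed here, the angle between the dimer dipole moment and the axis connecting the monomer centers of mass, can be computed as in the following sketch. The point-charge representation, the toy coordinates, and all function names are assumptions made for illustration; a real analysis would use the atomic charges and masses of a force field.

```python
import math


def center_of_mass(atoms):
    """atoms: list of (mass, charge, (x, y, z)) tuples."""
    total = sum(m for m, _, _ in atoms)
    return tuple(sum(m * xyz[k] for m, _, xyz in atoms) / total for k in range(3))


def dipole_moment(atoms, origin):
    """Point-charge dipole moment relative to `origin`."""
    return tuple(
        sum(q * (xyz[k] - origin[k]) for _, q, xyz in atoms) for k in range(3)
    )


def angle_deg(u, v):
    """Angle between two 3-vectors in degrees."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.hypot(*u) * math.hypot(*v)
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))


# Toy monomer A; monomer B is A rotated by 180 degrees about the z axis,
# i.e. an idealized C2-symmetric homodimer (invented coordinates).
monomer_a = [
    (12.0, 0.5, (3.0, 0.0, 1.0)),
    (14.0, -0.3, (4.0, 1.0, -0.5)),
    (16.0, -0.2, (5.0, -1.0, 0.5)),
]
monomer_b = [(m, q, (-x, -y, z)) for m, q, (x, y, z) in monomer_a]
dimer = monomer_a + monomer_b

com_a = center_of_mass(monomer_a)
com_b = center_of_mass(monomer_b)
axis = tuple(b - a for a, b in zip(com_a, com_b))
dipole = dipole_moment(dimer, center_of_mass(dimer))
angle = angle_deg(dipole, axis)
```

In this idealized C2-symmetric toy dimer the angle comes out at 90 degrees by construction, because the in-plane dipole components of the two monomers cancel; real structures deviate from perfect symmetry, so the measured angle is a distribution rather than a fixed value.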