In computational biology, protein p''K''a calculations are used to estimate the p''K''a values of amino acids as they exist within proteins. These calculations complement the p''K''a values reported for amino acids in their free state, and are used frequently within the fields of molecular modeling, structural bioinformatics, and computational biology.
Amino acid p''K''a values
The p''K''a values of amino acid side chains play an important role in defining the pH-dependent characteristics of a protein. The pH-dependence of the activity displayed by enzymes and the pH-dependence of protein stability, for example, are properties that are determined by the p''K''a values of amino acid side chains.
The p''K''a value of an amino acid side chain in solution is typically inferred from the p''K''a values of model compounds (compounds that are similar to the side chains of amino acids). See Amino acid for the p''K''a values of all amino acid side chains inferred in such a way. There are also numerous experimental studies that have yielded such values, for example by use of NMR spectroscopy.
The table below lists the model p''K''a values that are often used in a protein p''K''a calculation, and contains a third column based on protein studies.[Hass and Mulder (2015) ''Annu. Rev. Biophys.'' vol 44 pp. 53–7; doi:10.1146/annurev-biophys-083012-130351]
The effect of the protein environment
When a protein folds, the titratable amino acids in the protein are transferred from a solution-like environment to an environment determined by the three-dimensional structure of the protein. For example, in an unfolded protein an aspartic acid is typically in an environment that exposes the titratable side chain to water. When the protein folds, the aspartic acid could find itself buried deep in the protein interior with no exposure to solvent.
Furthermore, in the folded protein the aspartic acid will be closer to other titratable groups in the protein and will also interact with permanent charges (e.g. ions) and dipoles in the protein.
All of these effects alter the p''K''a value of the amino acid side chain, and p''K''a calculation methods generally calculate the effect of the protein environment on the model p''K''a value of an amino acid side chain.[Bashford (2004) ''Front Biosci.'' vol. 9 pp. 1082–9; doi:10.2741/1187][Gunner et al. (2006) ''Biochim. Biophys. Acta'' vol. 1757 (8) pp. 942–6; doi:10.1016/j.bbabio.2006.06.005][Ullmann et al. (2008) ''Photosynth. Res.'' 97 vol. 112 pp. 33–5; doi:10.1007/s11120-008-9306-1][Antosiewicz et al. (2011) ''Mol. BioSyst.'' vol. 7 pp. 2923–294; doi:10.1039/C1MB05170A]
Typically the effects of the protein environment on the amino acid p''K''a value are divided into pH-independent effects and pH-dependent effects. The pH-independent effects (desolvation, interactions with permanent charges and dipoles) are added to the model p''K''a value to give the intrinsic p''K''a value. The pH-dependent effects cannot be added in the same straightforward way and have to be accounted for using Boltzmann summation, Tanford–Roxby iterations or other methods.
The interplay of the intrinsic p''K''a values of a system with the electrostatic interaction energies between titratable groups can produce quite spectacular effects such as non-Henderson–Hasselbalch titration curves and even back-titration effects.[A. Onufriev, D.A. Case and G. M. Ullmann (2001). ''Biochemistry'' 40: 3413–341; doi:10.1021/bi002740q]
The image below shows a theoretical system consisting of three acidic residues. One group is displaying a back-titration event (blue group).
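The coupling between intrinsic p''K''a values and pairwise interactions can be illustrated with a Boltzmann summation over all protonation microstates. The following is a minimal sketch, not the method of any particular program. It assumes that all sites are acidic (charge 0 when protonated, −1 when deprotonated), that intrinsic p''K''a values refer to the state in which all other sites are neutral, and that interaction energies are given in p''K'' units (energy divided by ''RT'' ln 10). The function name `titration_curves` and all numerical values are hypothetical.

```python
from itertools import product


def titration_curves(pk_intr, w, ph_values):
    """Exact protonation probabilities by Boltzmann summation.

    pk_intr   : intrinsic pKa of each acidic site (assumed model)
    w         : symmetric matrix of pairwise interaction energies between
                unit charges, in pK units; positive means repulsion
    ph_values : pH values at which to evaluate the curves
    Returns one list of protonation probabilities (one per site) per pH.
    """
    n = len(pk_intr)
    curves = []
    for ph in ph_values:
        partition = 0.0
        occupancy = [0.0] * n
        # Enumerate all 2**n microstates; x[i] = 1 means site i is protonated.
        for x in product((0, 1), repeat=n):
            q = [xi - 1 for xi in x]  # acid: protonated 0, deprotonated -1
            # Microstate energy in pK units relative to the fully
            # deprotonated, non-interacting reference.
            g = sum(xi * (ph - pk) for xi, pk in zip(x, pk_intr))
            g += sum(w[i][j] * q[i] * q[j]
                     for i in range(n) for j in range(i + 1, n))
            weight = 10.0 ** (-g)
            partition += weight
            for i, xi in enumerate(x):
                occupancy[i] += xi * weight
        curves.append([o / partition for o in occupancy])
    return curves


if __name__ == "__main__":
    # Arbitrary illustrative three-site system (not taken from the article).
    pks = [4.0, 4.5, 5.0]
    couplings = [[0.0, 1.5, 0.5],
                 [1.5, 0.0, 1.0],
                 [0.5, 1.0, 0.0]]
    for ph, probs in zip(range(2, 10), titration_curves(pks, couplings, range(2, 10))):
        print(ph, [round(p, 3) for p in probs])
```

The enumeration grows as 2^''N'' with the number of sites, which is why larger systems are usually treated with Monte Carlo sampling or approximate methods instead.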
p''K''a calculation methods
Several software packages and web servers are available for the calculation of protein p''K''a values. See the links below or this table.
Using the Poisson–Boltzmann equation
Some methods are based on solutions to the Poisson–Boltzmann equation (PBE), often referred to as FDPB-based methods (''FDPB'' is for "finite difference Poisson–Boltzmann"). The PBE is a modification of Poisson's equation that incorporates a description of the effect of solvent ions on the electrostatic field around a molecule.
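For orientation, one common way of writing the nonlinear PBE is given below. The notation is an assumption made here and is not taken from a specific program: SI units are used, <math>\varepsilon(\mathbf{r})</math> is the position-dependent permittivity, <math>\phi(\mathbf{r})</math> the electrostatic potential, <math>\rho_\mathrm{f}(\mathbf{r})</math> the fixed charge density of the molecule, <math>c_i^\infty</math> and <math>q_i</math> the bulk concentration and charge of mobile ion species ''i'', and <math>\lambda(\mathbf{r})</math> a factor that is 1 in ion-accessible regions and 0 elsewhere.

:<math>\nabla \cdot \left[ \varepsilon(\mathbf{r}) \nabla \phi(\mathbf{r}) \right] = -\rho_\mathrm{f}(\mathbf{r}) - \sum_i c_i^\infty q_i \lambda(\mathbf{r}) \exp\left( -\frac{q_i \phi(\mathbf{r})}{k_\mathrm{B} T} \right)</math>

Without the sum over mobile ions, the expression reduces to Poisson's equation for a medium with non-uniform permittivity.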
The H++ web server, the pKD webserver, MCCE, Karlsberg+ and PETIT use the FDPB method to compute p''K''a values of amino acid side chains.
FDPB-based methods calculate the change in the p''K''a value of an amino acid side chain when that side chain is moved from a hypothetical fully solvated state to its position in the protein. To perform such a calculation, one needs theoretical methods that can calculate the effect of the protein interior on a p''K''a value, and knowledge of the p''K''a values of amino acid side chains in their fully solvated states.
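The transfer from the fully solvated state to the protein can be summarized as a thermodynamic cycle: the p''K''a shift is the difference between the transfer free energies of the deprotonated and protonated forms, divided by ''RT'' ln 10. The sketch below only illustrates this bookkeeping; the function name `shifted_pka`, the units (kcal/mol) and the numbers are assumptions, and in practice the transfer free energies would come from an electrostatics calculation.

```python
import math

GAS_CONSTANT = 1.987204e-3  # kcal/(mol*K)


def shifted_pka(model_pka, dg_transfer_deprot, dg_transfer_prot,
                temperature=298.15):
    """Return the pKa of a site in the protein from a thermodynamic cycle.

    model_pka          : pKa of the fully solvated model compound
    dg_transfer_deprot : free energy (kcal/mol) of moving the deprotonated
                         form from solvent into the protein
    dg_transfer_prot   : same for the protonated form
    """
    # If the protein destabilizes the deprotonated form more than the
    # protonated form, deprotonation becomes harder and the pKa rises.
    ddg = dg_transfer_deprot - dg_transfer_prot
    return model_pka + ddg / (GAS_CONSTANT * temperature * math.log(10.0))


# Purely illustrative numbers for a buried acidic side chain.
buried_asp = shifted_pka(4.0, dg_transfer_deprot=3.0, dg_transfer_prot=0.5)
```

At room temperature, ''RT'' ln 10 is about 1.36 kcal/mol, so each 1.36 kcal/mol of relative destabilization of the charged form shifts the p''K''a of an acid upward by roughly one unit.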
Empirical methods
A set of empirical rules relating the protein structure to the p''K''a values of ionizable residues has been developed by Li, Robertson, and Jensen. These rules form the basis for the web-accessible program called PROPKA for rapid predictions of p''K''a values.
A recent empirical p''K''a prediction program was released by Tan KP ''et al.'' with an online server, the DEPTH web server.
Molecular dynamics (MD)-based methods
Molecular dynamics methods of calculating p''K''a values make it possible to include full flexibility of the titrated molecule.[Donnini et al. (2011) ''J. Chem. Theory Comp.'' vol 7 pp. 1962–7; doi:10.1021/ct200061r][Wallace et al. (2011) ''J. Chem. Theory Comp.'' vol 7 pp. 2617–262; doi:10.1021/ct200146j][Goh et al. (2012) ''J. Chem. Theory Comp.'' vol 8 pp. 36–4; doi:10.1021/ct2006314]
Molecular dynamics based methods are typically much more computationally expensive, and not necessarily more accurate, ways to predict p''K''a values than approaches based on the Poisson–Boltzmann equation. Limited conformational flexibility can also be realized within a continuum electrostatics approach, e.g., by considering multiple amino acid side chain rotamers. In addition, commonly used molecular force fields do not take electronic polarizability into account, which could be an important property in determining protonation energies.
Determining p''K''a values from titration curves or free energy calculations
From the titration of a protonatable group, one can read off the so-called p''K''½, which is equal to the pH value at which the group is half-protonated. The p''K''½ is equal to the Henderson–Hasselbalch p''K''a (p''K''HH) if the titration curve follows the Henderson–Hasselbalch equation.[Ullmann (2003) ''J. Phys. Chem. B'' vol 107 pp. 1263–7; doi:10.1021/jp026454v]
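Most p''K''a calculation methods silently assume that all titration curves are Henderson–Hasselbalch shaped, and p''K''a values in p''K''a calculation programs are therefore often determined as the pH of half-protonation (p''K''½). In the general case of multiple interacting protonatable sites, the p''K''½ value is not thermodynamically meaningful. In contrast, the Henderson–Hasselbalch p''K''a value (p''K''HH) can be computed from the protonation free energy via
:<math>\mathrm{p}K^{\mathrm{HH}}(\mathrm{pH}) = \mathrm{pH} - \frac{\Delta G^{\mathrm{prot}}(\mathrm{pH})}{RT \ln 10}</math>
and is thus in turn related to the protonation free energy of the site via
:<math>\Delta G^{\mathrm{prot}}(\mathrm{pH}) = RT \ln 10 \, \left( \mathrm{pH} - \mathrm{p}K^{\mathrm{HH}}(\mathrm{pH}) \right)</math>.
The protonation free energy can in principle be computed from the protonation probability of the group, <math>\langle x \rangle(\mathrm{pH})</math>, which can be read from its titration curve:
:<math>\Delta G^{\mathrm{prot}}(\mathrm{pH}) = -RT \ln \frac{\langle x \rangle}{1 - \langle x \rangle}</math>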
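The relations between protonation probability, protonation free energy and Henderson–Hasselbalch p''K''a can be sketched in a few lines. The function names, the units (kJ/mol) and the default temperature are assumptions made for illustration, and the sign convention is that a negative protonation free energy favors the protonated form.

```python
import math

GAS_CONSTANT = 8.314462618e-3  # kJ/(mol*K)


def protonation_free_energy(x_mean, temperature=298.15):
    """Protonation free energy (kJ/mol) from a protonation probability."""
    if not 0.0 < x_mean < 1.0:
        raise ValueError("protonation probability must lie strictly between 0 and 1")
    return -GAS_CONSTANT * temperature * math.log(x_mean / (1.0 - x_mean))


def pk_hh(ph, x_mean, temperature=298.15):
    """Henderson-Hasselbalch pKa at a given pH from the protonation probability."""
    dg = protonation_free_energy(x_mean, temperature)
    return ph - dg / (GAS_CONSTANT * temperature * math.log(10.0))
```

For a half-protonated group the free energy is zero and the function returns the pH itself. For a group whose titration curve is not Henderson–Hasselbalch shaped, the returned value changes with the pH at which it is evaluated.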
Titration curves can be computed within a continuum electrostatics approach with formally exact but more elaborate analytical or Monte Carlo (MC) methods, or inexact but fast approximate methods. MC methods that have been used to compute titration curves[Ullmann et al. (2012) ''J. Comput. Chem.'' vol 33 pp. 887–90; doi:10.1002/jcc.22919] are Metropolis MC[Metropolis et al. (1953) ''J. Chem. Phys.'' vol 21 pp. 1087–109; doi:10.1063/1.1699114][Beroza et al. (1991) ''Proc. Natl. Acad. Sci. USA'' vol 88 pp. 5804–580; doi:10.1073/pnas.88.13.5804] or Wang–Landau MC.[Wang and Landau (2001) ''Phys. Rev. E'' vol 64, 056101; doi:10.1103/PhysRevE.64.056101] Approximate methods that use a mean-field approach for computing titration curves are the Tanford–Roxby method and hybrids of this method that combine an exact statistical mechanics treatment within clusters of strongly interacting sites with a mean-field treatment of intercluster interactions.[Tanford and Roxby (1972) ''Biochemistry'' vol 11 pp. 2192–219; doi:10.1021/bi00761a029][Bashford and Karplus (1991) ''J. Phys. Chem.'' vol 95 pp. 9556–6; doi:10.1021/j100176a093][Gilson (1993) ''Proteins'' vol 15 pp. 266–8; doi:10.1002/prot.340150305][Antosiewicz et al. (1994) ''J. Mol. Biol.'' vol 238 pp. 415–3; doi:10.1006/jmbi.1994.1301][Spassov and Bashford (1999) ''J. Comput. Chem.'' vol 20 pp. 1091–111; doi:10.1002/(SICI)1096-987X(199908)20:11<1091::AID-JCC1>3.0.CO;2-3]
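The mean-field idea behind the Tanford–Roxby method can be sketched as a self-consistent iteration: each site titrates with an effective p''K''a that is shifted by the average charges of all other sites. The code below is a simplified illustration under stated assumptions, not the published algorithm in full: all sites are acidic (charge 0 when protonated, −1 when deprotonated), interactions are given in p''K'' units, and the damping factor `mixing` is an added convenience for convergence. All names are hypothetical.

```python
def tanford_roxby(pk_intr, w, ph, tol=1e-10, max_iter=10000, mixing=0.5):
    """Mean-field protonation probabilities at a single pH.

    pk_intr : intrinsic pKa of each acidic site (assumed model)
    w       : symmetric matrix of pairwise interaction energies between
              unit charges, in pK units; positive means repulsion
    ph      : pH at which to evaluate the probabilities
    """
    n = len(pk_intr)
    x = [0.5] * n  # initial guess: every site half-protonated
    for _ in range(max_iter):
        new_x = []
        for i in range(n):
            # Average charge of site j is x[j] - 1 for an acid; a negatively
            # charged neighbor raises the effective pKa of site i.
            pk_eff = pk_intr[i] - sum(w[i][j] * (x[j] - 1.0)
                                      for j in range(n) if j != i)
            new_x.append(1.0 / (1.0 + 10.0 ** (ph - pk_eff)))
        change = max(abs(a - b) for a, b in zip(new_x, x))
        # Damped update to avoid oscillations between iterations.
        x = [(1.0 - mixing) * a + mixing * b for a, b in zip(x, new_x)]
        if change < tol:
            return x
    raise RuntimeError("mean-field iteration did not converge")
```

The cost per iteration scales quadratically with the number of sites rather than exponentially, at the price of neglecting correlations between the protonation states of strongly coupled sites.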
In practice, it can be difficult to obtain statistically converged and accurate protonation free energies from titration curves if the protonation probability is close to a value of 1 or 0. In this case, one can use various free energy calculation methods to obtain the protonation free energy, such as biased Metropolis MC,[Beroza et al. (1995) ''Biophys. J.'' vol 68 pp. 2233–225; doi:10.1016/S0006-3495(95)80406-6] free-energy perturbation,[Zwanzig (1954) ''J. Chem. Phys.'' vol 22 pp. 1420–142; doi:10.1063/1.1740409][Ullmann et al. (2011) ''J. Phys. Chem. B'' vol 115 pp. 507–52; doi:10.1021/jp1093838] thermodynamic integration,[Kirkwood (1935) ''J. Chem. Phys.'' vol 3 pp. 300–31; doi:10.1063/1.1749657][Bruckner and Boresch (2011) ''J. Comput. Chem.'' vol 32 pp. 1303–131; doi:10.1002/jcc.21713][Bruckner and Boresch (2011) ''J. Comput. Chem.'' vol 32 pp. 1320–133; doi:10.1002/jcc.21712] the non-equilibrium work method[Jarzynski (1997) ''Phys. Rev. E'' vol 56 p. 5018; doi:10.1103/PhysRevE.56.5018] or the Bennett acceptance ratio method.[Bennett (1976) ''J. Comput. Phys.'' vol 22 pp. 245–26; doi:10.1016/0021-9991(76)90078-4]
Note that the Henderson–Hasselbalch p''K''a value (p''K''HH) does in general depend on the pH value.[Bombarda et al. (2010) ''J. Phys. Chem. B'' vol 114 pp. 1994–200; doi:10.1021/jp908926w] This dependence is small for weakly interacting groups like well solvated amino acid side chains on the protein surface, but can be large for strongly interacting groups like those buried in enzyme active sites or integral membrane proteins.[Bashford and Gerwert (1992) ''J. Mol. Biol.'' vol 224 pp. 473–8; doi:10.1016/0022-2836(92)91009-E][Spassov et al. (2001) ''J. Mol. Biol.'' vol 312 pp. 203–1; doi:10.1006/jmbi.2001.4902][Ullmann et al. (2011) ''J. Phys. Chem. B'' vol 115 pp. 10346–5; doi:10.1021/jp204644h]
Software for protein p''K''a calculations
AccelrysPKA: Accelrys CHARMm based p''K''a calculation
H++: Poisson–Boltzmann based p''K''a calculations
MCCE2: Multi-Conformation Continuum Electrostatics (Version 2)
Karlsberg+: p''K''a computation with multiple pH adapted conformations
PETIT: Proton and Electron TITration
Generalized Monte Carlo Titration
DEPTH web server: Empirical calculation of p''K''a values using Residue Depth as a major feature