
Resolution by Proxy (ResProx) is a method for assessing the equivalent X-ray resolution of NMR-derived protein structures. ResProx calculates resolution from coordinate data rather than from electron density or other experimental inputs. This makes it possible to calculate the resolution of a structure regardless of how it was solved (X-ray, NMR, EM, modeling, ab initio prediction). ResProx was originally designed to serve as a simple, single-number evaluation that allows straightforward comparison between the quality/resolution of X-ray structures and the quality of a given NMR structure. However, it can also be used to assess the reliability of an experimentally reported X-ray structure resolution, to evaluate protein structures solved by unconventional or hybrid means, and to identify fraudulent structures deposited in the PDB. ResProx incorporates more than 25 different structural features to determine a single resolution-like value. ResProx values are reported in Angstroms. Tests on thousands of X-ray structures show that ResProx values match very closely to resolution values reported by X-ray crystallographers. Resolution-by-proxy values can be calculated for newly determined protein structures using a freely accessible ResProx web server. This server accepts protein coordinate data (in PDB format) and generates a resolution estimate (in Angstroms) for that input structure.


Background and Rationale

In X-ray crystallography, resolution is a measure of the resolvability or precision in the electron density map of a molecule. Resolution is usually reported in Angstroms (Å, 10⁻¹⁰ meters) for X-ray crystal structures. The smaller the number, the better the degree of atomic resolution. In protein X-ray crystallography the best resolution typically attainable is about 1 Å. This level of resolution allows individual hydrogen atoms to be visualized and heavy atoms (C, O, N) to be very accurately mapped. Most protein structures solved today have a resolution of 1.5 to 2.5 Å, which means the hydrogen atoms are not visible and there is some uncertainty in the precise location of the heavy atoms. Protein structures with a resolution value above 2.5 Å generally have a number of coordinate inaccuracies as well as other structural problems. When the resolution value exceeds 3.5 Å, there is often considerable uncertainty in both the atom locations and even the identity of individual amino acid residues. In other words, the resolution value is inversely related to structure quality (i.e. higher numbers mean poorer structures).

This trend in protein structure quality for X-ray resolution matches very closely the trend seen in the quality of NMR-determined protein structures. Some NMR structures have large numbers of constraints (NOEs, H-bonds, J-couplings, dipolar couplings), excellent geometry, high structure quality and very tight ensembles with excellent atomic precision (RMSDs < 1 Å). Other NMR structures have very few constraints, poor geometry or poor structure quality and very loose ensembles (RMSDs > 3 Å). However, there is no simple mapping between NMR RMSD values and X-ray resolution values. That is, an NMR ensemble with 1 Å RMSD does not correspond in quality or precision to an X-ray structure with 1 Å resolution. This is because the RMSD measure is a function of both the number of structures used in the ensemble and the selection bias of the spectroscopist who deposits the structural ensemble. Likewise, in NMR it is possible to generate high quality, precisely determined protein structures using relatively few, well-chosen constraints. It is also possible to generate very low quality NMR structures from large numbers of carelessly assessed, mistaken or mis-assigned constraints.

Over the past 20 years several methods have been proposed to calculate “equivalent resolution” using only X-ray coordinate data (rather than X-ray diffraction data). Some, such as Procheck-NMR, were designed specifically for evaluating NMR structures, while others, such as MolProbity and RosettaHoles2, were designed more for structure quality evaluation and validation of X-ray structures. However, these methods rely on a relatively small number of protein structure quality measures to predict resolution (4, 3, and 1 measures, respectively), and consequently the correlation between observed (X-ray) resolution and the predicted resolution is not particularly good. By expanding the number of structure features to include the distribution of torsion angles, the presence of atom clashes, the normality of hydrogen bonding, the numbers of violations of bond lengths and bond angles, the presence of cavities, residue-specific packing volumes, packing efficiency and threading energies, it is possible to improve this correlation quite substantially.
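
The point made above, that an ensemble RMSD depends on which models are included in the ensemble, can be illustrated with a small calculation. The following Python sketch is not part of ResProx. It assumes the models have already been superimposed and list the same atoms in the same order, and it uses one common definition of ensemble RMSD (the average RMSD of each model from the mean structure). The function name and the toy coordinates are hypothetical.

```python
import math


def ensemble_rmsd(models):
    """Average RMSD (in the input units) of each model from the mean structure.

    models: list of models, each a list of (x, y, z) tuples.
    Assumption: models are pre-superimposed and share atom ordering.
    """
    n_models = len(models)
    n_atoms = len(models[0])

    # Mean position of every atom across the ensemble.
    mean = [
        tuple(sum(m[i][k] for m in models) / n_models for k in range(3))
        for i in range(n_atoms)
    ]

    # RMSD of each model from that mean structure.
    per_model = []
    for m in models:
        sq_sum = sum(
            (m[i][k] - mean[i][k]) ** 2
            for i in range(n_atoms)
            for k in range(3)
        )
        per_model.append(math.sqrt(sq_sum / n_atoms))

    return sum(per_model) / n_models


# Toy two-atom "models" (made-up coordinates, in Angstroms).
tight_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
tight_b = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
outlier = [(0.0, 3.0, 0.0), (1.0, 3.0, 0.0)]

print(ensemble_rmsd([tight_a, tight_b]))           # outlier left out
print(ensemble_rmsd([tight_a, tight_b, outlier]))  # outlier included
```

In this toy case the two identical models give an RMSD of 0 Å, while including the third model raises it to about 1.33 Å. The same underlying data can therefore yield very different RMSD values depending on which models are deposited, which is why RMSD alone does not map onto X-ray resolution.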


The ResProx Algorithm

ResProx uses a collection of 25 different protein structure features (such as torsion angle distributions, hydrogen bonding, packing volume, cavities and MolProbity measures) as inputs to a Support Vector Regression (SVR) model, which was trained to maximize the correlation between the predicted resolution and the observed X-ray resolution on a set of 2400 protein structures with known X-ray resolution. The exact details of the algorithm are provided in a paper published by Wishart and colleagues. After training and appropriate validation on independent test sets, this SVR model is able to estimate the resolution of solved X-ray structures with a correlation coefficient of 0.92 and a mean absolute error of 0.28 Å. This is about 15-30% better than existing methods (Figure 1). Because ResProx performs this well and needs only coordinate data to generate an estimate of the equivalent X-ray resolution, it is well suited to NMR structures. When NMR structures are analyzed by ResProx, the average NMR structure has an equivalent X-ray resolution of 2.8 Å, which is relatively poor (Figure 2). This is in agreement with qualitative observations regarding the overall quality and precision of NMR structures. As seen in Figure 2, a very small number of NMR structures exhibit a resolution equivalent to < 1.0 Å, but these are rare.

Figure 1. Performance of ResProx against training and testing data.

Figure 2. Histogram of ResProx equivalent resolution for NMR models and experimental resolution for X-ray structures. 500 NMR ensembles and 500 X-ray structures were randomly selected from the PDB. Proteins were grouped in 0.25 Å resolution bins. Resolution values on the X-axis indicate the upper limit of each resolution bin. Values for NMR structures and X-ray structures represent the number of structures in each resolution bin.
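
The two accuracy figures quoted for the SVR model, the correlation coefficient and the mean absolute error, are standard statistics that compare predicted values with observed ones. The Python sketch below shows how they are computed. It does not reproduce the SVR model or the ResProx feature set, and the resolution values in it are invented purely for illustration. The correlation coefficient is assumed here to be the Pearson coefficient.

```python
import math


def mean_absolute_error(observed, predicted):
    """Average absolute difference between paired observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical values in Angstroms (NOT ResProx results).
observed_resolution = [1.0, 1.5, 2.0, 2.5, 3.0]
predicted_resolution = [1.2, 1.4, 2.3, 2.4, 3.1]

print("MAE (Å):", mean_absolute_error(observed_resolution, predicted_resolution))
print("Pearson r:", pearson_r(observed_resolution, predicted_resolution))
```

The mean absolute error is expressed in the same units as the resolution (Å), so it can be read directly as the typical size of the prediction error. The correlation coefficient is unitless and only measures how closely the predictions track the observed values.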


The ResProx Server

The ResProx web server is a freely accessible server that accepts NMR protein coordinate data (in PDB format) and generates a resolution estimate (in Angstroms) for that NMR structure. A downloadable version of ResProx is also available. ResProx also provides a list of 50,834 protein structures with PDB identifiers, along with their observed resolution and corresponding ResProx values.
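
To show what "coordinate data in PDB format" looks like as input, the Python sketch below reads atom names, residue information and x, y, z coordinates from the fixed-width ATOM records of a PDB-format file. It does not describe how the ResProx server processes its input. The function names are hypothetical, and stopping at the first ENDMDL record (so that only the first model of an NMR ensemble is read) is a simplifying assumption made for this example.

```python
def parse_pdb_atoms(text):
    """Return a list of atoms from the ATOM records in PDB-format text.

    Column positions follow the fixed-width PDB ATOM record layout
    (atom name 13-16, residue name 18-20, chain 22, residue number 23-26,
    x 31-38, y 39-46, z 47-54; 1-based columns).
    Assumption: only the first model is read (parsing stops at ENDMDL).
    """
    atoms = []
    for line in text.splitlines():
        if line.startswith("ENDMDL"):
            break
        if line.startswith("ATOM  "):
            atoms.append({
                "name": line[12:16].strip(),
                "res_name": line[17:20].strip(),
                "chain": line[21],
                "res_seq": int(line[22:26]),
                "xyz": (
                    float(line[30:38]),
                    float(line[38:46]),
                    float(line[46:54]),
                ),
            })
    return atoms


def make_atom_line(serial, name, res_name, chain, res_seq, x, y, z):
    """Build a minimal ATOM record (demo helper; occupancy and B-factor omitted)."""
    return "ATOM  %5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f" % (
        serial, name, res_name, chain, res_seq, x, y, z
    )


# Made-up coordinates for a tiny two-model example.
sample = "\n".join([
    "MODEL        1",
    make_atom_line(1, " N", "MET", "A", 1, 27.340, 24.430, 2.614),
    make_atom_line(2, " CA", "MET", "A", 1, 26.266, 25.413, 2.842),
    "ENDMDL",
    "MODEL        2",
    make_atom_line(1, " N", "MET", "A", 1, 27.100, 24.900, 2.500),
    "ENDMDL",
])

for atom in parse_pdb_atoms(sample):
    print(atom["name"], atom["res_name"], atom["res_seq"], atom["xyz"])
```

Because the format is column-based rather than whitespace-delimited, the fields are sliced by position. Splitting on spaces would fail for files where adjacent coordinate columns run together.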

