In bioinformatics, the template modeling score or TM-score is a measure of similarity between two protein structures. The TM-score is intended as a more accurate measure of the global similarity of full-length protein structures than the often-used root-mean-square deviation (RMSD). The score lies in the interval (0, 1], where 1 indicates a perfect match between two structures, so higher is better. Generally, scores below 0.20 correspond to randomly chosen unrelated proteins, whereas structures with a score higher than 0.5 have roughly the same fold. A quantitative study shows that proteins with a TM-score of 0.5 have a posterior probability of 37% of being in the same CATH topology family and of 13% of being in the same SCOP fold family. The probabilities increase rapidly when the TM-score exceeds 0.5. The TM-score is designed to be independent of protein length.


The TM-score equation

The TM-score between two protein structures (e.g., a template structure and a target structure) is defined by

$$\text{TM-score}=\max\left[\frac{1}{L_\text{target}}\sum_{i}^{L_\text{common}}\frac{1}{1+\left(\frac{d_i}{d_0(L_\text{target})}\right)^2}\right]$$

where the maximum is taken over spatial superpositions of the two structures, $L_\text{target}$ is the length of the amino acid sequence of the target protein, and $L_\text{common}$ is the number of residues that appear in both the template and target structures. $d_i$ is the distance between the $i$th pair of residues in the template and target structures, and $d_0(L_\text{target})=1.24\sqrt[3]{L_\text{target}-15}-1.8$ is a distance scale that normalizes distances.

When comparing two protein structures that have the same residue order, $L_\text{common}$ is read from the C-alpha residue numbers of the structure files (i.e., columns 23-26 in the Protein Data Bank file format). When comparing two protein structures that have different sequences and/or different residue orders, a structural alignment is usually performed first, and the TM-score is then calculated on the commonly aligned residues from the structural alignment.
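The definition above can be illustrated with a minimal Python sketch. It evaluates only the bracketed sum for one fixed superposition, so the search over superpositions implied by the max is left out, and the residue pairing is assumed to be already known. The function names `d0` and `tm_score_fixed` are hypothetical and do not come from any TM-score program. Rejecting very short chains, for which the formula gives a non-positive scale, is a choice made for this sketch only.

```python
def d0(l_target: int) -> float:
    """Distance scale d0(L_target) = 1.24 * cbrt(L_target - 15) - 1.8 (angstroms)."""
    if l_target <= 15:
        # Assumption of this sketch: refuse lengths where the cube root
        # argument is not positive instead of guessing a fallback value.
        raise ValueError("l_target must be greater than 15")
    scale = 1.24 * (l_target - 15) ** (1.0 / 3.0) - 1.8
    if scale <= 0.0:
        raise ValueError("formula gives a non-positive d0 for this length")
    return scale


def tm_score_fixed(distances, l_target: int) -> float:
    """Bracketed TM-score sum for ONE given superposition (no maximization).

    distances: d_i in angstroms for the L_common paired residues.
    l_target:  number of residues in the target protein (the normalizer).
    """
    scale = d0(l_target)
    total = sum(1.0 / (1.0 + (d / scale) ** 2) for d in distances)
    return total / l_target


if __name__ == "__main__":
    # 100-residue target, every residue paired and perfectly superposed.
    print(tm_score_fixed([0.0] * 100, 100))
    # Same target, but only half of its residues have a partner.
    print(tm_score_fixed([0.0] * 50, 100))
```

Because the sum is divided by $L_\text{target}$ and not by $L_\text{common}$, unpaired target residues lower the score even when every paired residue superposes exactly.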


Other measures

An often-used structural similarity measure is the root-mean-square deviation (RMSD). Because RMSD $=\sqrt{\frac{1}{L_\text{common}}\sum_i d_i^2}$ is calculated as an average of the squared distance error ($d_i$) with equal weight over all residue pairs, a large local error on a few residue pairs can result in a quite large RMSD. On the other hand, by putting $d_i$ in the denominator, the TM-score naturally weights smaller distance errors more strongly than larger distance errors. Therefore, compared to RMSD, the TM-score is more sensitive to global structural similarity than to local structural errors. Another advantage of the TM-score is the introduction of the scale $d_0(L_\text{target})=1.24\sqrt[3]{L_\text{target}-15}-1.8$, which makes the magnitude of the TM-score length-independent for random structure pairs, while RMSD and most other measures are length-dependent metrics.

The Global Distance Test (GDT) algorithm, with its GDT_TS score representing "total score", is another measure of similarity between two protein structures with known amino acid correspondences (e.g. identical amino acid sequences) but different tertiary structures. The GDT score has the same length-dependence issue as RMSD, because the average GDT score for random structure pairs has a power-law dependence on the protein size.
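The different weighting of large errors in RMSD and the TM-score can be seen in a small Python sketch. It assumes the structures are already superposed and the per-residue distances are given; the TM-score part evaluates the sum for that single superposition without any maximization. The two distance lists are invented for illustration, and the function names are hypothetical.

```python
import math


def rmsd(distances) -> float:
    """Root-mean-square deviation over the paired residues."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))


def tm_score_fixed(distances, l_target: int) -> float:
    """TM-score sum for one fixed superposition (assumes l_target > 18)."""
    d0 = 1.24 * (l_target - 15) ** (1.0 / 3.0) - 1.8
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / l_target


# Illustrative data: 100 residues, all 1 angstrom off.
uniform = [1.0] * 100
# Same model, but a 10-residue tail is misplaced by 30 angstroms.
with_tail = [1.0] * 90 + [30.0] * 10

for name, dist in (("uniform", uniform), ("with_tail", with_tail)):
    print(f"{name}: RMSD={rmsd(dist):.2f}  TM-score={tm_score_fixed(dist, 100):.2f}")
```

In this made-up case the misplaced tail raises the RMSD from 1 Å to more than 9 Å, while the TM-score stays above 0.8, because each badly placed residue contributes almost nothing to the sum instead of dominating it.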


See also

* RMSD: a different structure comparison measure
* GDT: a different structure comparison measure
* Longest continuous segment (LCS): a different structure comparison measure
* Global distance calculation (GDC_sc, GDC_all): structure comparison measures that use full-model information (not just α-carbon) to assess similarity
* Local global alignment (LGA): protein structure alignment program and structure comparison measure


External links


* TM-score webserver: by the Yang Zhang research group. Calculates TM-score and supplies source code.
* Services and documentation on structure comparison and similarity measures.