Critical Assessment of Structure Prediction (CASP), sometimes called Critical Assessment of Protein Structure Prediction, is a community-wide, worldwide experiment for protein structure prediction that has taken place every two years since 1994. CASP gives research groups an opportunity to test their structure prediction methods objectively and delivers an independent assessment of the state of the art in protein structure modeling to the research community and software users. Although the primary goal of CASP is to help advance the methods of identifying a protein's three-dimensional structure from its amino acid sequence, many view the experiment more as a “world championship” in this field of science. More than 100 research groups from all over the world participate in CASP on a regular basis, and it is not uncommon for entire groups to suspend their other research for months while they focus on getting their servers ready for the experiment and on performing the detailed predictions.
Selection of target proteins
To ensure that no predictor has prior information about a protein's structure that would put them at an advantage, it is important that the experiment be conducted in a double-blind fashion: neither the predictors nor the organizers and assessors know the structures of the target proteins at the time when predictions are made. Targets for structure prediction are either structures soon to be solved by X-ray crystallography or NMR spectroscopy, or structures that have just been solved (mainly by one of the structural genomics centers) and are kept on hold by the Protein Data Bank. If the given sequence is found to be related by common descent to a protein sequence of known structure (called a template), comparative protein modeling may be used to predict the tertiary structure. Templates can be found using sequence alignment methods (e.g. BLAST or HHsearch) or protein threading methods, which are better at finding distantly related templates. Otherwise, ''de novo'' protein structure prediction must be applied (e.g. Rosetta), which is much less reliable but can sometimes yield models with the correct fold (usually for proteins of fewer than 100–150 amino acids). Truly new folds are becoming quite rare among the targets, making that category smaller than desirable.
Evaluation
The primary method of evaluation is a comparison of the predicted model's α-carbon positions with those in the target structure. The comparison is shown visually by cumulative plots of distances between pairs of equivalent α-carbons in the alignment of the model and the structure, such as the one shown in the figure (a perfect model would stay at zero all the way across), and is assigned a numerical score, GDT-TS (Global Distance Test, Total Score), describing the percentage of well-modeled residues in the model with respect to the target.
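The scoring idea can be illustrated with a minimal sketch. It is a simplification, not the procedure used by the CASP assessors: GDT-TS is conventionally the average of the percentages of residues whose α-carbons lie within 1, 2, 4 and 8 Å of the target, with each percentage maximized over many superpositions. The sketch below assumes a single, already computed superposition and takes one α-carbon distance per target residue as input (residues missing from the model can be passed as `float("inf")`). The function names `percent_within`, `approx_gdt_ts` and `cumulative_curve` are hypothetical and chosen only for this illustration.

```python
from typing import List, Sequence, Tuple

# Distance cutoffs in angstroms conventionally averaged for GDT-TS.
GDT_TS_CUTOFFS = (1.0, 2.0, 4.0, 8.0)


def percent_within(distances: Sequence[float], cutoff: float) -> float:
    """Percentage of target residues whose C-alpha deviation is <= cutoff."""
    if not distances:
        raise ValueError("need at least one residue distance")
    return 100.0 * sum(d <= cutoff for d in distances) / len(distances)


def approx_gdt_ts(distances: Sequence[float]) -> float:
    """Approximate GDT-TS from ONE fixed superposition (assumption).

    The real score searches over superpositions to maximize each
    percentage, so this value is a lower bound on it.
    """
    percents = [percent_within(distances, c) for c in GDT_TS_CUTOFFS]
    return sum(percents) / len(percents)


def cumulative_curve(
    distances: Sequence[float], cutoffs: Sequence[float]
) -> List[Tuple[float, float]]:
    """(cutoff, percent of residues within cutoff) pairs for a cumulative plot."""
    return [(c, percent_within(distances, c)) for c in cutoffs]


if __name__ == "__main__":
    # Made-up per-residue distances, for illustration only.
    example = [0.5, 1.5, 3.0, 6.0, 10.0]
    print(approx_gdt_ts(example))  # 50.0
    print(cumulative_curve(example, [1.0, 2.0, 4.0, 8.0]))
```

For the made-up distances above, 20%, 40%, 60% and 80% of the residues fall within the four cutoffs, so the approximate score is 50. A perfect model would place every residue within the smallest cutoff and score 100.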
Free modeling (template-free, or ''de novo'') is also evaluated visually by the assessors, since the numerical scores do not work as well for finding loose resemblances in the most difficult cases. High-accuracy template-based predictions were evaluated in CASP7 by whether they worked for molecular-replacement phasing of the target crystal structure, with successes followed up later, and in CASP8 by full-model (not just α-carbon) model quality and full-model match to the target.
Evaluation of the results is carried out in the following prediction categories:
* tertiary structure prediction (all CASPs)
* secondary structure prediction (dropped after CASP5)
* prediction of structure complexes (CASP2 only; a separate experiment, CAPRI, carries on this subject)
* residue-residue contact prediction (starting CASP4)
* disordered regions prediction (starting CASP5)
* domain boundary prediction (CASP6–CASP8)
* function prediction (starting CASP6)
* model quality assessment (starting CASP7)
* model refinement (starting CASP7)
* high-accuracy template-based prediction (starting CASP7)
The tertiary structure prediction category was further subdivided into:
* homology modeling
* fold recognition (also called protein threading; note that this naming is incorrect, as threading is a method)
* ''de novo'' structure prediction, now referred to as 'New Fold', as many methods apply evaluation, or scoring, functions that are biased by knowledge of native protein structures, such as an artificial neural network.
Starting with CASP7, the categories have been redefined to reflect developments in methods. The 'template-based modeling' category includes all former comparative modeling, homologous-fold-based models and some analogous-fold-based models. The 'template-free modeling (FM)' category includes models of proteins with previously unseen folds and hard analogous-fold-based models. Because template-free targets are quite rare, the so-called CASP ROLL was introduced in 2011. This continuous (rolling) CASP experiment aims at a more rigorous evaluation of template-free prediction methods by assessing a larger number of targets outside the regular CASP prediction season. Unlike LiveBench and EVA, this experiment is in the blind-prediction spirit of CASP, i.e. all the predictions are made on as yet unknown structures.
The CASP results are published in special supplement issues of the scientific journal ''Proteins'', all of which are accessible through the CASP website. A lead article in each of these supplements describes specifics of the experiment, while a closing article evaluates progress in the field.
In December 2018, CASP13 made headlines when it was won by AlphaFold, an artificial intelligence program created by DeepMind. In November 2020, an improved version 2 of AlphaFold won CASP14. According to John Moult, one of CASP's co-founders, AlphaFold scored around 90 on a 100-point scale of prediction accuracy for moderately difficult protein targets (‘It will change everything’: DeepMind’s AI makes gigantic leap in solving protein structures).
See also
* Critical Assessment of Prediction of Interactions (CAPRI)
* Critical Assessment of Function Annotation (CAFA)
* Critical Assessment of Genome Interpretation (CAGI)
References
External links
* Official website: https://predictioncenter.org/
CASP ROLL
FORCASP Forum
Result ranking
Automated assessments for CASP15 (2022)
Official ranking for servers only
Official ranking for humans and servers
Automated assessments for CASP14 (2020)
Official ranking for servers only
Official ranking for humans and servers
Automated assessments for CASP13 (2018)
Official ranking for servers only
Official ranking for humans and servers
Automated assessments for CASP12 (2016)
Official ranking for servers only
Official ranking for humans and servers
Automated assessments for CASP11 (2014)
Official ranking for servers only (126 targets)
Official ranking for humans and servers (78 targets)
Automated assessments for CASP10 (2012)
Official ranking for servers only (127 targets)
Official ranking for humans and servers (71 targets)
Ranking by Zhang Lab
Automated assessments for CASP9 (2010)
Ranking by Zhang Lab
Ranking by Cheng Lab
Automated assessments for CASP8 (2008)
Official ranking for servers only
Official ranking for humans and servers
Ranking by Zhang Lab
Ranking by Cheng Lab
Automated assessments for CASP7 (2006)
Ranking by Livebench
Ranking by Zhang Lab