1. Introduction to the CASTp Server
CASTp stands for Computed Atlas of Surface Topography of proteins.
2. Geometric Modeling Principles
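CASTp reports pocket areas and volumes that are calculated analytically. As a toy illustration of what an analytic calculation looks like for atom spheres, the sketch below computes the exact volume of the union of two overlapping spheres by inclusion-exclusion with spherical caps. It is not CASTp's implementation, which relies on alpha shapes to handle many atoms at once; the function names and the radii in the usage lines are assumptions chosen for illustration.

```python
import math


def sphere_volume(r):
    """Volume of a sphere of radius r."""
    return 4.0 / 3.0 * math.pi * r ** 3


def cap_volume(r, h):
    """Volume of a spherical cap of height h cut from a sphere of radius r."""
    return math.pi * h * h * (3.0 * r - h) / 3.0


def union_volume(r1, r2, d):
    """Exact volume of the union of two spheres whose centers are d apart."""
    if d >= r1 + r2:
        # Disjoint (or touching) spheres: nothing to subtract.
        return sphere_volume(r1) + sphere_volume(r2)
    if d <= abs(r1 - r2):
        # One sphere lies entirely inside the other.
        return sphere_volume(max(r1, r2))
    # Heights of the two caps that together make up the lens-shaped overlap.
    h1 = (r2 - r1 + d) * (r2 + r1 - d) / (2.0 * d)
    h2 = (r1 - r2 + d) * (r1 + r2 - d) / (2.0 * d)
    overlap = cap_volume(r1, h1) + cap_volume(r2, h2)
    return sphere_volume(r1) + sphere_volume(r2) - overlap


# Two carbon atoms (assumed van der Waals radius 1.70 angstroms), each inflated
# by an assumed 1.4 angstrom probe radius, with centers 1.54 angstroms apart.
inflated = 1.70 + 1.4
print(union_volume(inflated, inflated, 1.54))
```

For more than two atoms the overlaps of triples and quadruples also have to be accounted for, which is the bookkeeping that the alpha-shape approach organizes.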
CASTp applies the alpha-shape and discrete-flow methods to identify protein binding sites and to measure pocket size. These calculations are carried out by the CAST program, introduced by Liang et al. in 1998 and updated by Tian et al. in 2018. CAST first identifies the atoms that form a protein pocket and calculates the pocket's volume and area. It then identifies the atoms forming the rim of each pocket mouth, counts the mouth openings of each pocket, and measures the area and circumference of each opening. Finally, it locates cavities and calculates their size. Secondary structures are calculated by DSSP. Single amino acid annotations are fetched from the UniProt database and mapped to PDB structures using residue-level information from the SIFTS database.
3. Instructions for Protein Pocket Calculation
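As a hedged illustration of the two inputs this section describes, a structure in PDB format and a probe radius, the sketch below reads atom coordinates from PDB-format text and tests whether a spherical probe fits at a given point without overlapping any atom. It is a simplified stand-in for the idea of a pocket "housing a solvent", not the CASTp algorithm. The van der Waals radii (Bondi values), the fallback radius, and the helper names `make_atom_line`, `parse_atoms` and `probe_fits` are assumptions introduced for illustration.

```python
import math

# Bondi van der Waals radii in angstroms (an assumption; CASTp may use another set).
VDW_RADII = {"C": 1.70, "N": 1.55, "O": 1.52, "S": 1.80}
DEFAULT_RADIUS = 1.70  # hypothetical fallback for elements not listed above


def make_atom_line(serial, name, res_name, chain, res_seq, x, y, z, element):
    """Build one fixed-column PDB ATOM record (hypothetical helper for toy data)."""
    return (
        "ATOM  {:5d} {:<4s} {:>3s} {:1s}{:4d}    "
        "{:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}{:>12s}"
    ).format(serial, name, res_name, chain, res_seq, x, y, z, 1.0, 0.0, element)


def parse_atoms(pdb_text):
    """Return a list of ((x, y, z), radius) for ATOM/HETATM records."""
    atoms = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            # PDB columns 31-38, 39-46, 47-54 hold x, y, z; columns 77-78 the element.
            center = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            element = line[76:78].strip().upper()
            atoms.append((center, VDW_RADII.get(element, DEFAULT_RADIUS)))
    return atoms


def probe_fits(point, atoms, probe_radius=1.4):
    """True if a probe sphere centered at point overlaps no atom sphere."""
    return all(
        math.dist(point, center) >= radius + probe_radius
        for center, radius in atoms
    )


# Toy structure: two carbon atoms 10 angstroms apart on the x axis.
toy_pdb = "\n".join([
    make_atom_line(1, "CA", "ALA", "A", 1, 0.0, 0.0, 0.0, "C"),
    make_atom_line(2, "CA", "ALA", "A", 2, 10.0, 0.0, 0.0, "C"),
])
atoms = parse_atoms(toy_pdb)
print(probe_fits((5.0, 0.0, 0.0), atoms))  # True: 5.0 >= 1.70 + 1.4 for both atoms
print(probe_fits((2.0, 0.0, 0.0), atoms))  # False: 2.0 < 1.70 + 1.4
```

Increasing `probe_radius` makes fewer points acceptable, which is why the probe radius is exposed as an input alongside the structure.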
3.1 Input
Protein structures in PDB format, and a probe radius.
3.2 Searching
Users can either search for pre-computed results by four-character PDB ID, or upload their own PDB file for customized computation. The core algorithm finds pockets and cavities capable of housing a solvent probe, using either the default probe radius of 1.4 Å or a user-adjusted value.
3.3 Output
CASTp identifies all surface pockets, interior cavities and cross channels, and provides a detailed delineation of all atoms participating in their formation. For each pocket or void it reports the area and volume, as well as the number of mouth openings of the pocket with a given pocket ID. These measurements are given under both the solvent accessible surface model (Richards' surface) and the molecular surface model (Connolly surface), and all are calculated analytically. This online tool also supports
4. Why is CASTp Useful?
4.1 Protein science, from amino acids to sequences and structures
Proteins are large, complex molecules that play critical roles in maintaining the normal functioning of the human body. They are essential not only for the structure and function of the body's tissues and organs, but also for their regulation. Proteins are made up of hundreds of smaller units called amino acids, which are attached to one another by peptide bonds to form a long chain.
4.2 Protein active sites
The active site of a protein is usually its center of action and the key to its function. The first step in studying it is to detect active sites on the protein surface and to describe their features and boundaries exactly. These specifications are vital inputs for subsequent target druggability prediction or target comparison. Most algorithms for active site detection are based on geometric modeling or on energy-based calculation.
4.3 The role of protein pockets
The shape and properties of the protein surface determine what interactions are possible with ligands and other macromolecules. Pockets are an important yet ambiguous feature of this surface. In drug discovery, the first step in screening for lead compounds and potential drug molecules is usually a selection based on the shape of the binding pocket. Shape plays a role in many computational pharmacological methods. Existing results suggest that most features important for predicting drug binding depend on the size and shape of the binding pocket, with chemical properties of secondary importance. The surface shape is also important for interactions between protein and water. However, defining discrete pockets or possible interaction sites remains difficult, because the shape and location of nearby pockets affect the promiscuity and diversity of binding sites. Since most pockets are open to solvent, defining the border of a pocket is the primary difficulty.
Pockets that are closed to solvent are referred to as buried cavities. Because their extent, area and volume are well defined, buried cavities are more straightforward to locate. In contrast, the border of an open pocket defines its mouth, and provides the cut-off for determining the surface area and volume. Even defining the pocket as a set of residues does not define the volume or the mouth of the pocket.
4.4 Druggability prediction
In the pharmaceutical industry, the current priority strategy for target assessment is high-throughput screening (HTS). NMR screenings are applied against large compound datasets. The chemical characteristics of compounds binding to specific targets are measured, so the outcome depends on how well the compound set covers the relevant chemical space. Success rates of virtually docking drug-like ligands into the active sites of target proteins can also be used for prioritization, and most active sites are located in pockets. Benefiting from the large amount of available structural data, computational methods for druggability prediction have been introduced from different perspectives over the last 30 years with positive results, and have become a vital instrument for accelerating prediction. Many of them have since been integrated into the drug discovery pipeline.
5. New Features in CASTp 3.0
5.1 Pre-computed results for biological assemblies
For many proteins deposited in the Protein Data Bank, the asymmetric unit differs from the biological unit, which can make the computational result biologically irrelevant. CASTp 3.0 therefore computes the topological features for biological assemblies, bridging the gap between asymmetric units and biological assemblies.
5.2 Imprints of negative volumes of topological features
In the 2006 release of the CASTp server, only the geometric and topological features of the surface atoms participating in the formation of protein pockets, cavities, and channels were provided. The new CASTp adds the "negative volume" of the space, that is, the space encompassed by the atoms that form these geometric and topological features.
5.3 Comprehensive annotation on single amino-acid polymorphism
The latest CASTp integrates protein annotations aligned with the sequence, including the brief feature, position, description, and reference for domains, motifs, and single amino-acid polymorphisms.
5.4 Improved user interface & convenient visualization
The new CASTp incorporates 3Dmol.js for structural visualization, allowing users to browse and interact with the protein 3D model and to examine the computational results in current web browsers, including Chrome, Firefox, and Safari. Users can pick the representation style of the atoms that form each topographic feature and edit the colors to their own preferences.