Protein Subcellular Localization Prediction

Protein subcellular localization prediction (or just protein localization prediction) involves the prediction of where a protein resides in a cell, its subcellular localization. In general, prediction tools take as input information about a protein, such as a protein sequence of amino acids, and produce a predicted location within the cell as output, such as the nucleus, endoplasmic reticulum, Golgi apparatus, extracellular space, or other organelles. The aim is to build tools that can accurately predict the outcome of protein targeting in cells. Prediction of protein subcellular localization is an important component of bioinformatics-based prediction of protein function and genome annotation, and it can aid the identification of drug targets.


Background

Experimentally determining the subcellular localization of a protein can be a laborious and time-consuming task. Immunolabeling or tagging (such as with a green fluorescent protein) to view localization using a fluorescence microscope is often used. A high-throughput alternative is to use prediction. Through the development of new approaches in computer science, coupled with an increased dataset of proteins of known localization, computational tools can now provide fast and accurate localization predictions for many organisms. This has resulted in subcellular localization prediction becoming one of the challenges being successfully aided by bioinformatics and machine learning. Many prediction methods now exceed the accuracy of some high-throughput laboratory methods for the identification of protein subcellular localization. In particular, some predictors have been developed that can handle proteins that may simultaneously exist in, or move between, two or more different subcellular locations. Experimental validation is typically required to confirm the predicted localizations.
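Handling proteins that occupy more than one location can be pictured as a multi-label decision over per-location scores. The following is a minimal sketch of that idea, not the method of any particular predictor: the function name `assign_locations`, the threshold of 0.5, and the example scores are all assumptions made for illustration.

```python
def assign_locations(scores: dict[str, float], threshold: float = 0.5) -> list[str]:
    """Return every location whose score reaches the threshold.

    `scores` maps a location name to a hypothetical predictor score in [0, 1].
    If no location reaches the threshold, fall back to the single best one,
    so that every protein receives at least one predicted location.
    """
    if not scores:
        raise ValueError("no location scores supplied")
    above = [loc for loc, score in scores.items() if score >= threshold]
    if above:
        # Strongest prediction first.
        return sorted(above, key=lambda loc: scores[loc], reverse=True)
    return [max(scores, key=scores.get)]


# Hypothetical scores for one protein (illustrative values, not real output).
example_scores = {"nucleus": 0.8, "cytoplasm": 0.6, "mitochondrion": 0.1}
predicted = assign_locations(example_scores)
```

With these illustrative scores the protein is assigned to both the nucleus and the cytoplasm, which is how a shuttling protein might be reported.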


Tools

In 1999 PSORT was the first published program to predict subcellular localization. Subsequent tools and websites have been released using techniques such as artificial neural networks, support vector machines and protein motifs. Predictors can be specialized for proteins in different organisms: some are specialized for eukaryotic proteins, some for human proteins, and some for plant proteins. Methods for predicting bacterial protein localization, and their accuracy, have been reviewed. In 2021, SCLpred-MEM, a membrane protein prediction tool powered by artificial neural networks, was published. SCLpred-EMS is another tool powered by artificial neural networks; it classifies proteins into endomembrane system and secretory pathway (EMS) versus all others. Similarly, Light-Attention uses machine learning methods to predict ten different common subcellular locations. The development of protein subcellular location prediction has been summarized in two comprehensive review articles. Recent tools and an experience report can be found in a paper by Meinken and Min (2012).
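The general shape of a sequence-based predictor (sequence in, location out) can be sketched with a deliberately simple stand-in. The example below uses amino acid composition as the feature vector and a nearest-centroid classifier in place of the neural networks and support vector machines mentioned above; it is not how PSORT, SCLpred or Light-Attention work. The training sequences are invented toy strings, not real proteins, and the function names are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(sequence: str) -> list[float]:
    """Fraction of each standard amino acid in the sequence."""
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def train_centroids(examples: list[tuple[str, str]]) -> dict[str, list[float]]:
    """Average the composition vectors of the training sequences per label."""
    sums: dict[str, list[float]] = {}
    sizes: dict[str, int] = {}
    for sequence, label in examples:
        acc = sums.setdefault(label, [0.0] * len(AMINO_ACIDS))
        for i, value in enumerate(composition(sequence)):
            acc[i] += value
        sizes[label] = sizes.get(label, 0) + 1
    return {label: [v / sizes[label] for v in acc] for label, acc in sums.items()}


def predict(sequence: str, centroids: dict[str, list[float]]) -> str:
    """Return the label whose centroid is closest in Euclidean distance."""
    vector = composition(sequence)
    return min(centroids, key=lambda label: math.dist(vector, centroids[label]))


# Invented toy data (assumption): lysine/arginine-rich strings stand in for
# nuclear proteins, hydrophobic strings for membrane proteins.
TOY_TRAINING = [
    ("KKKRKRKKRR", "nucleus"),
    ("KRKKRSKKRK", "nucleus"),
    ("LLLAALLVVL", "membrane"),
    ("LAVLLIVLAL", "membrane"),
]
centroids = train_centroids(TOY_TRAINING)
```

A call such as `predict("RKKRKKRK", centroids)` returns one of the training labels. Published tools train on large curated sets of proteins with known localization and use far richer features than composition alone.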


Application

Knowledge of the subcellular localization of a protein can significantly improve target identification during the drug discovery process. For example, secreted proteins and plasma membrane proteins are easily accessible by drug molecules due to their localization in the extracellular space or on the cell surface. Bacterial cell surface and secreted proteins are also of interest for their potential as vaccine candidates or as diagnostic targets. Aberrant subcellular localization of proteins has been observed in cells affected by several diseases, such as cancer and Alzheimer's disease. Secreted proteins from some archaea that can survive in unusual environments have industrially important applications. Prediction allows a large number of proteins to be assessed in order to find candidates that are trafficked to the desired location.


Databases

The results of subcellular localization prediction can be stored in databases. Examples include the multi-species database Compartments; FunSecKB2, a fungal database; PlantSecKB, a plant database; MetazSecKB, an animal and human database; and ProtSecKB, a protist database.
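As a rough illustration of storing predictions and then screening them for candidates at a desired location, the sketch below uses Python's built-in `sqlite3` module. The table layout, the protein identifiers and the scores are assumptions invented for this example; they do not reflect the schema or contents of Compartments or any of the other databases named above.

```python
import sqlite3


def build_store(rows: list[tuple[str, str, float]]) -> sqlite3.Connection:
    """Create an in-memory table of (protein_id, location, score) predictions."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE prediction ("
        " protein_id TEXT NOT NULL,"
        " location TEXT NOT NULL,"
        " score REAL NOT NULL,"
        " PRIMARY KEY (protein_id, location))"
    )
    conn.executemany("INSERT INTO prediction VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def candidates(conn: sqlite3.Connection, location: str, min_score: float = 0.5) -> list[str]:
    """Protein IDs predicted at `location` with at least `min_score`, best first."""
    cursor = conn.execute(
        "SELECT protein_id FROM prediction"
        " WHERE location = ? AND score >= ?"
        " ORDER BY score DESC, protein_id",
        (location, min_score),
    )
    return [row[0] for row in cursor]


# Hypothetical predictions (invented identifiers and scores).
EXAMPLE_ROWS = [
    ("protA", "extracellular", 0.92),
    ("protA", "plasma membrane", 0.55),
    ("protB", "nucleus", 0.88),
    ("protC", "extracellular", 0.61),
    ("protD", "extracellular", 0.30),
]
conn = build_store(EXAMPLE_ROWS)
secreted = candidates(conn, "extracellular")
```

Keying the table on the pair (protein, location) lets a single protein carry several predicted locations, and the score threshold makes it straightforward to tighten or relax a screen.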


Further reading

* Chou KC, Shen HB (Nov 2007). "Recent progress in protein subcellular location prediction". Analytical Biochemistry. 370 (1): 1–16. doi:10.1016/j.ab.2007.07.006. PMID 17698024.