Proteomics is the large-scale study of proteins.
Proteins are vital parts of living organisms, with many functions such as the formation of structural fibers of muscle tissue, enzymatic digestion of food, or synthesis and replication of DNA. In addition, other kinds of proteins include antibodies that protect an organism from infection, and hormones that send important signals throughout the body.
The proteome is the entire set of proteins produced or modified by an organism or system. Proteomics enables the identification of ever-increasing numbers of proteins. The proteome varies with time and with the distinct requirements, or stresses, that a cell or organism undergoes.
Proteomics is an interdisciplinary domain that has benefited greatly from the genetic information of various genome projects, including the Human Genome Project. It covers the exploration of proteomes from the overall level of protein composition, structure, and activity, and is an important component of functional genomics.
''Proteomics'' generally denotes the large-scale experimental analysis of proteins and proteomes, but often refers specifically to protein purification and mass spectrometry.
History and etymology
The first studies of proteins that could be regarded as proteomics began in 1975, after the introduction of the two-dimensional gel and mapping of the proteins from the bacterium ''Escherichia coli''.
''Proteome'' is a blend of the words "protein" and "genome". It was coined in 1994 by then-Ph.D. student Marc Wilkins at Macquarie University, which founded the first dedicated proteomics laboratory in 1995.
Complexity of the problem
After genomics and transcriptomics, proteomics is the next step in the study of biological systems. It is more complicated than genomics because an organism's genome is more or less constant, whereas proteomes differ from cell to cell and from time to time. Distinct genes are expressed in different cell types, which means that even the basic set of proteins produced in a cell must be identified.
In the past this phenomenon was assessed by RNA analysis, which was found to lack correlation with protein content. It is now known that mRNA is not always translated into protein, and the amount of protein produced for a given amount of mRNA depends on the gene it is transcribed from and on the cell's physiological state. Proteomics confirms the presence of the protein and provides a direct measure of its quantity.
Post-translational modifications
Not only does the translation from mRNA cause differences, but many proteins also are subjected to a wide variety of chemical modifications after translation. The most common and widely studied post-translational modifications include phosphorylation and glycosylation. Many of these post-translational modifications are critical to the protein's function.
Phosphorylation
One such modification is phosphorylation, which happens to many enzymes and structural proteins in the process of cell signaling. The addition of a phosphate to particular amino acids (most commonly serine and threonine, mediated by serine-threonine kinases, or more rarely tyrosine, mediated by tyrosine kinases) causes a protein to become a target for binding or interacting with a distinct set of other proteins that recognize the phosphorylated domain.
Because protein phosphorylation is one of the most studied protein modifications, many "proteomic" efforts are geared to determining the set of phosphorylated proteins in a particular cell or tissue-type under particular circumstances. This alerts the scientist to the signaling pathways that may be active in that instance.
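As a rough illustration of the residues involved, the following sketch lists the serine, threonine and tyrosine positions of a peptide as candidate phosphorylation sites. The peptide sequence and the function name are hypothetical, and the mass constant is the standard monoisotopic shift for one added phosphate group (HPO3), a value that does not come from the text above. Sequence alone does not show which sites are actually occupied; that has to be measured.

```python
# Monoisotopic mass added by one phosphate group (HPO3), in daltons.
# Standard value, introduced here for illustration only.
PHOSPHO_MASS_SHIFT = 79.96633


def candidate_phosphosites(sequence):
    """Return (1-based position, residue) for every S, T or Y in the sequence."""
    return [
        (position, residue)
        for position, residue in enumerate(sequence.upper(), start=1)
        if residue in "STY"
    ]


peptide = "MKSPTAYLR"  # hypothetical peptide, not from any real protein
sites = candidate_phosphosites(peptide)
print(sites)

# Largest mass increase if every candidate site carried one phosphate
max_shift = len(sites) * PHOSPHO_MASS_SHIFT
print(f"up to {max_shift:.3f} Da heavier when fully phosphorylated")
```

In a phosphoproteomics experiment, a mass increase of this size on a peptide is one of the signals used to infer that a site is modified.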
Ubiquitination
Ubiquitin is a small protein that may be affixed to certain protein substrates by enzymes called E3 ubiquitin ligases. Determining which proteins are poly-ubiquitinated helps to understand how protein pathways are regulated. This is, therefore, an additional legitimate "proteomic" study. Similarly, once a researcher determines which substrates are ubiquitinated by each ligase, determining the set of ligases expressed in a particular cell type is helpful.
Additional modifications
In addition to phosphorylation and ubiquitination, proteins may be subjected to (among others) methylation, acetylation, glycosylation, oxidation, and nitrosylation. Some proteins undergo all these modifications, often in time-dependent combinations. This illustrates the potential complexity of studying protein structure and function.
Distinct proteins are made under distinct settings
A cell may make different sets of proteins at different times or under different conditions, for example during development, cellular differentiation, cell cycle, or carcinogenesis. Further increasing proteome complexity, as mentioned, most proteins are able to undergo a wide range of post-translational modifications.
Therefore, a "proteomics" study may become complex very quickly, even if the topic of study is restricted. In more ambitious settings, such as when a biomarker for a specific cancer subtype is sought, the proteomics scientist might elect to study multiple blood serum samples from multiple cancer patients to minimise confounding factors and account for experimental noise. Thus, complicated experimental designs are sometimes necessary to account for the dynamic complexity of the proteome.
Limitations of genomics and proteomics studies
Proteomics gives a different level of understanding than genomics for many reasons:
* the level of transcription of a gene gives only a rough estimate of its ''level of translation'' into a protein. An mRNA produced in abundance may be degraded rapidly or translated inefficiently, resulting in a small amount of protein.
* as mentioned above, many proteins experience ''post-translational modifications'' that profoundly affect their activities; for example, some proteins are not active until they become phosphorylated. Methods such as phosphoproteomics and glycoproteomics are used to study post-translational modifications.
* many transcripts give rise to more than one protein, through alternative splicing or alternative post-translational modifications.
* many proteins form complexes with other proteins or RNA molecules, and only function in the presence of these other molecules.
* protein degradation rate plays an important role in protein content.
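The points about translation efficiency and degradation can be made concrete with a deliberately simple steady-state model. This model is an assumption introduced here for illustration, not a method described in this article: if protein is produced at a rate proportional to mRNA abundance and removed by first-order degradation, its steady-state level is mRNA × k_translation / k_degradation. All numbers below are invented.

```python
def steady_state_protein(mrna, k_translation, k_degradation):
    """Steady state of dP/dt = k_translation * mrna - k_degradation * P."""
    if k_degradation <= 0:
        raise ValueError("k_degradation must be positive")
    return k_translation * mrna / k_degradation


# Two hypothetical genes: gene B has ten times more mRNA than gene A,
# but it is translated less efficiently and its protein is degraded faster.
gene_a = steady_state_protein(mrna=10, k_translation=50.0, k_degradation=0.1)
gene_b = steady_state_protein(mrna=100, k_translation=5.0, k_degradation=1.0)

print(f"gene A protein level: {gene_a:.0f}")
print(f"gene B protein level: {gene_b:.0f}")
```

Under these made-up rates the gene with less mRNA ends up with more protein, which is the kind of mismatch that makes transcript levels only a rough guide to protein content.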
''Reproducibility''. One major factor affecting reproducibility in proteomics experiments is the simultaneous elution of many more peptides than mass spectrometers can measure. This causes stochastic differences between experiments due to data-dependent acquisition of tryptic peptides. Although early large-scale shotgun proteomics analyses showed considerable variability between laboratories, presumably due in part to technical and experimental differences between laboratories, reproducibility has been improved in more recent mass spectrometry analysis, particularly on the protein level. Notably, targeted proteomics shows increased reproducibility and repeatability compared with shotgun methods, although at the expense of data density and effectiveness.
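The stochastic effect of data-dependent acquisition can be sketched with a toy simulation. It assumes, purely for illustration, that the instrument fragments only the N most intense co-eluting peptides and that measured intensities fluctuate from run to run with log-normal noise. The peptide names, intensities, noise level and the function `top_n_selection` are all hypothetical, and real acquisition logic is considerably more involved.

```python
import random


def top_n_selection(intensities, n, noise_sd, rng):
    """Pick the n most intense peptides after multiplicative log-normal noise."""
    observed = {
        peptide: intensity * rng.lognormvariate(0.0, noise_sd)
        for peptide, intensity in intensities.items()
    }
    ranked = sorted(observed, key=observed.get, reverse=True)
    return set(ranked[:n])


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# 200 hypothetical co-eluting peptides with similar true intensities
true_intensity = {f"pep{i}": 1000.0 + i for i in range(200)}

run1 = top_n_selection(true_intensity, n=20, noise_sd=0.3, rng=rng)
run2 = top_n_selection(true_intensity, n=20, noise_sd=0.3, rng=rng)

print(len(run1 & run2), "of 20 peptides were selected in both runs")
```

With the noise set to zero the two runs select exactly the same peptides; with noise, the overlap shrinks, which mirrors the run-to-run differences described above.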
''Data quality''. Proteomic analysis is highly amenable to automation and large data sets are created, which are processed by software algorithms. Filter parameters are used to reduce the number of false hits, but they cannot be completely eliminated. Scientists have expressed the need for awareness that proteomics experiments should adhere to the criteria of analytical chemistry (sufficient data quality, sanity check, validation).
Methods of studying proteins
In proteomics, there are multiple methods to study proteins. Generally, proteins may be detected by using antibodies (immunoassays), electrophoretic separation, or mass spectrometry. If a complex biological sample is analyzed, either a very specific antibody needs to be used in quantitative dot blot analysis (QDB), or biochemical separation needs to be used before the detection step, as there are too many analytes in the sample to perform accurate detection and quantification.
Protein detection with antibodies (immunoassays)
Antibodies to particular proteins, or to their modified forms, have been used in biochemistry and cell biology studies. These are among the most common tools used by molecular biologists today. There are several specific techniques and protocols that use antibodies for protein detection. The enzyme-linked immunosorbent assay (ELISA) has been used for decades to detect and quantitatively measure proteins in samples. The western blot may be used for detection and quantification of individual proteins, where in an initial step, a complex protein mixture is separated using SDS-PAGE and then the protein of interest is identified using an antibody.
Modified proteins may be studied by developing an antibody specific to that modification. For example, there are antibodies that only recognize certain proteins when they are tyrosine-phosphorylated; these are known as phospho-specific antibodies. There are also antibodies specific to other modifications. These may be used to determine the set of proteins that have undergone the modification of interest.
Immunoassays can also be carried out using recombinantly generated immunoglobulin derivatives or synthetically designed protein scaffolds that are selected for high antigen specificity. Such binders include single domain antibody fragments (Nanobodies), designed ankyrin repeat proteins (DARPins) and aptamers.
Disease detection at the molecular level is driving the emerging revolution of early diagnosis and treatment. A challenge facing the field is that protein biomarkers for early diagnosis may be present in very low abundance. The lower limit of detection with conventional immunoassay technology is the upper femtomolar range (10^−13 M). Digital immunoassay technology has improved detection sensitivity three logs, to the attomolar range (10^−16 M). This capability has the potential to open new advances in diagnostics and therapeutics, but such technologies have been relegated to manual procedures that are not well suited for efficient routine use.
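To put the two detection limits in perspective, the short calculation below converts each concentration into a number of molecules and confirms the three-log difference. The 100 microlitre sample volume is an arbitrary assumption chosen for illustration; only the two concentrations come from the text.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole


def molecules_in_sample(molar_concentration, volume_litres):
    """Number of analyte molecules in a sample of the given volume."""
    return molar_concentration * volume_litres * AVOGADRO


conventional_lod = 1e-13  # mol/L, upper femtomolar range
digital_lod = 1e-16       # mol/L, attomolar range
sample_volume = 100e-6    # litres; hypothetical 100 microlitre sample

fold_improvement = conventional_lod / digital_lod
print("orders of magnitude gained:", round(math.log10(fold_improvement)))

for label, lod in [("conventional", conventional_lod), ("digital", digital_lod)]:
    count = molecules_in_sample(lod, sample_volume)
    print(f"{label}: about {count:.3g} molecules at the detection limit")
```

At the attomolar limit a sample of this size contains only a few thousand target molecules, which gives a sense of how little material such assays have to work with.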
Antibody-free protein detection
While protein detection with antibodies is still very common in molecular biology, other methods that do not rely on an antibody have been developed as well. These methods offer various advantages: for instance, they are often able to determine the sequence of a protein or peptide, they may have higher throughput than antibody-based methods, and they can sometimes identify and quantify proteins for which no antibody exists.
Detection methods
One of the earliest methods for protein analysis was Edman degradation (introduced in 1967), where a single peptide is subjected to multiple steps of chemical degradation to resolve its sequence. These early methods have mostly been supplanted by technologies that offer higher throughput.
More recently implemented methods use mass spectrometry-based techniques, a development that was made possible by the discovery of "soft ionization" methods developed in the 1980s, such as matrix-assisted laser desorption/ionization (MALDI) and electrospray ionization (ESI). These methods gave rise to the top-down and the bottom-up proteomics workflows, where additional separation is often performed before analysis (see below).
Separation methods
For the analysis of complex biological samples, a reduction of sample complexity is required. This may be performed off-line by one-dimensional or two-dimensional separation. More recently, on-line methods have been developed where individual peptides (in bottom-up proteomics approaches) are separated using reversed-phase chromatography and then directly ionized using ESI; the direct coupling of separation and analysis explains the term "on-line" analysis.
Hybrid technologies
There are several hybrid technologies that use antibody-based purification of individual analytes and then perform mass spectrometric analysis for identification and quantification. Examples of these methods are the
MSIA (mass spectrometric immunoassay), developed by Randall Nelson in 1995,
and the SISCAPA (Stable Isotope Standard Capture with Anti-Peptide Antibodies) method, introduced by Leigh Anderson in 2004.
Current research methodologies
Fluorescence two-dimensional differential gel electrophoresis (2-D DIGE) may be used to quantify variation in the 2-D DIGE process and establish statistically valid thresholds for assigning quantitative changes between samples.
Comparative proteomic analysis may reveal the role of proteins in complex biological systems, including reproduction. For example, treatment with the insecticide triazophos causes an increase in the content of brown planthopper (''Nilaparvata lugens'' (Stål)) male accessory gland proteins (Acps) that may be transferred to females via mating, causing an increase in fecundity (i.e. birth rate) of females. To identify changes in the types of accessory gland proteins (Acps) and reproductive proteins that mated female planthoppers received from male planthoppers, researchers conducted a comparative proteomic analysis of mated ''N. lugens'' females. The results indicated that these proteins participate in the reproductive process of ''N. lugens'' adult females and males.
Proteome analysis of ''Arabidopsis'' peroxisomes has been established as the major unbiased approach for identifying new peroxisomal proteins on a large scale.
There are many approaches to characterizing the human proteome, which is estimated to contain between 20,000 and 25,000 non-redundant proteins. RNA splicing and proteolysis events likely increase the number of unique protein species by between 50,000 and 500,000, and when post-translational modifications are also considered, the total number of unique human proteins is estimated to range in the low millions.
In addition, the first promising attempts to decipher the proteome of animal tumors have recently been reported.
This method was used as a functional method in ''Macrobrachium rosenbergii'' protein profiling.
High-throughput proteomic technologies
Proteomics has steadily gained momentum over the past decade with the evolution of several approaches. A few of these are new, and others build on traditional methods. Mass spectrometry-based methods, affinity proteomics, and microarrays are the most common technologies for large-scale study of proteins.
Mass spectrometry and protein profiling
There are two mass spectrometry-based methods currently used for protein profiling. The more established and widespread method uses high resolution, two-dimensional electrophoresis to separate proteins from different samples in parallel, followed by selection and staining of differentially expressed proteins to be identified by mass spectrometry. Despite the advances in 2-DE and its maturity, it has its limits as well.
The central concern is the inability to resolve all the proteins within a sample, given their dramatic range in expression level and differing properties. The combination of pore size and protein charge, size and shape can greatly affect migration rate, which leads to other complications.
The second quantitative approach uses stable isotope tags to differentially label proteins from two different complex mixtures. Here, the proteins within a complex mixture are labeled isotopically first, and then digested to yield labeled peptides. The labeled mixtures are then combined, and the peptides are separated by multidimensional liquid chromatography and analyzed by tandem mass spectrometry. Isotope-coded affinity tag (ICAT) reagents are widely used isotope tags. In this method, the cysteine residues of proteins are covalently attached to the ICAT reagent, which reduces the complexity of the mixture because peptides that lack cysteine are excluded.
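The complexity reduction described above can be sketched in a few lines. The example performs an in-silico tryptic digest using the common simplified rule (cleave after lysine or arginine unless the next residue is proline), keeps only the cysteine-containing peptides that a cysteine-reactive tag could capture, and computes a heavy-to-light intensity ratio. The protein sequence, the intensities and all function names are hypothetical, and the ratio is only the simplest possible form of relative quantification, not a full ICAT workflow.

```python
def tryptic_digest(protein):
    """Cleave after K or R, except when the next residue is P (simplified rule)."""
    peptides = []
    start = 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal peptide without K/R
    return peptides


def cysteine_peptides(peptides):
    """Keep only peptides that contain at least one cysteine (C)."""
    return [peptide for peptide in peptides if "C" in peptide]


def abundance_ratio(heavy_intensity, light_intensity):
    """Relative abundance of the heavy-labeled sample versus the light one."""
    return heavy_intensity / light_intensity


protein = "MACDEKGHILRPSTKCWYRAAK"  # hypothetical sequence
all_peptides = tryptic_digest(protein)
tagged = cysteine_peptides(all_peptides)

print(all_peptides)
print(tagged)
print(abundance_ratio(heavy_intensity=3000.0, light_intensity=1500.0))
```

Here only two of the four peptides carry a cysteine, so the tagged fraction is half the size of the full digest while still representing the protein.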
Quantitative proteomics using stable isotopic tagging is an increasingly useful tool in modern development. Firstly, chemical reactions have been used to introduce tags into specific sites or proteins for the purpose of probing specific protein functionalities. The isolation of phosphorylated peptides has been achieved using isotopic labeling and selective chemistries to capture that fraction of protein from the complex mixture. Secondly, the ICAT technology was used to differentiate between partially purified or purified macromolecular complexes such as the large RNA polymerase II pre-initiation complex and the proteins complexed with yeast transcription factor. Thirdly, ICAT labeling was recently combined with chromatin isolation to identify and quantify chromatin-associated proteins. Finally, ICAT reagents are useful for proteomic profiling of cellular organelles and specific cellular fractions.
Another quantitative approach is the accurate mass and time (AMT) tag approach developed by Richard D. Smith and coworkers at Pacific Northwest National Laboratory. In this approach, increased throughput and sensitivity are achieved by avoiding the need for tandem mass spectrometry, and making use of precisely determined separation time information and highly accurate mass determinations for peptide and protein identifications.
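The idea of identifying a peptide from its mass and separation time alone can be sketched as a lookup against a reference table. In this minimal sketch every observed feature is compared with every reference entry and is accepted when both the mass error (in parts per million) and the difference in normalized elution time fall within a tolerance. The reference entries, the tolerances and the function names are hypothetical and chosen only for illustration; they are not taken from the published method.

```python
def ppm_error(observed_mass, reference_mass):
    """Mass error in parts per million relative to the reference mass."""
    return (observed_mass - reference_mass) / reference_mass * 1e6


def match_amt_tags(features, reference_tags, ppm_tol=5.0, time_tol=0.02):
    """Match (mass, elution_time) features to (peptide, mass, elution_time) tags.

    Elution times are assumed to be normalized to the range 0..1.
    Returns a list of (feature, [matching peptide names]).
    """
    results = []
    for mass, elution_time in features:
        hits = [
            peptide
            for peptide, ref_mass, ref_time in reference_tags
            if abs(ppm_error(mass, ref_mass)) <= ppm_tol
            and abs(elution_time - ref_time) <= time_tol
        ]
        results.append(((mass, elution_time), hits))
    return results


# Hypothetical reference table: (peptide label, mass in Da, normalized elution time)
reference_tags = [
    ("PEPA", 1000.000, 0.51),
    ("PEPB", 1000.003, 0.80),  # nearly the same mass, but elutes much later
]

# Hypothetical observed features: (mass in Da, normalized elution time)
features = [(1000.002, 0.50), (1500.100, 0.30)]

for feature, hits in match_amt_tags(features, reference_tags):
    print(feature, "->", hits if hits else "no match")
```

The first feature lies within the mass tolerance of both reference entries, and it is the elution time that separates them, which is why the time dimension matters when no fragmentation spectrum is acquired.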
Affinity proteomics
Affinity proteomics uses antibodies or other affinity reagents (such as oligonucleotide-based aptamers) as protein-specific detection probes. Currently this method can interrogate several thousand proteins, typically from biofluids such as plasma, serum or cerebrospinal fluid (CSF). A key differentiator for this technology is the ability to analyze hundreds or thousands of samples in a reasonable timeframe (a matter of days or weeks); mass spectrometry-based methods are not scalable to this level of sample throughput for proteomics analyses.
Protein chips
The use of protein microarrays balances the use of mass spectrometers in proteomics and in medicine. The aim behind protein microarrays is to print thousands of protein-detecting features for the interrogation of biological samples. Antibody arrays are an example in which a host of different antibodies are arrayed to detect their respective antigens from a sample of human blood. Another approach is the arraying of multiple protein types for the study of properties like protein-DNA, protein-protein and protein-ligand interactions. Ideally, functional proteomic arrays would contain the entire complement of the proteins of a given organism. The first version of such arrays consisted of 5000 purified proteins from yeast deposited onto glass microscope slides. Despite the success of the first chip, implementing protein arrays proved to be a greater challenge. Proteins are inherently much more difficult to work with than DNA. They have a broad dynamic range, are less stable than DNA, and their structure is difficult to preserve on glass slides, even though intact structure is essential for most assays. The global ICAT technology has striking advantages over protein chip technologies.
Reverse-phase protein microarrays
This is a promising and newer microarray application for the diagnosis, study and treatment of complex diseases such as cancer. The technology merges laser capture microdissection (LCM) with microarray technology to produce reverse-phase protein microarrays. In this type of microarray, the whole collection of proteins themselves is immobilized with the intent of capturing various stages of disease within an individual patient. When used with LCM, reverse-phase arrays can monitor the fluctuating state of the proteome among different cell populations within a small area of human tissue. This is useful for profiling the status of cellular signaling molecules across a cross-section of tissue that includes both normal and cancerous cells. This approach is useful in monitoring the status of key factors in normal prostate epithelium and invasive prostate cancer tissues. LCM is used to dissect these tissues, and the protein lysates are arrayed onto nitrocellulose slides, which are then probed with specific antibodies. This method can track many kinds of molecular events and can compare diseased and healthy tissues within the same patient, enabling the development of treatment strategies and diagnosis. The ability to acquire proteomic snapshots of neighboring cell populations, using reverse-phase microarrays in conjunction with LCM, has a number of applications beyond the study of tumors. The approach can provide insights into the normal physiology and pathology of all tissues and is invaluable for characterizing developmental processes and anomalies.
Protein Detection via Bioorthogonal Chemistry
Recent advancements in bioorthogonal chemistry have revealed applications in protein analysis. Using organic molecules that react with proteins offers a wide range of methods to tag them. Unnatural amino acids and various functional groups represent new and growing technologies in proteomics.
Specific biomolecules that are capable of being metabolized in cells or tissues are inserted into proteins or glycans. The molecule carries an affinity tag, which modifies the protein and allows it to be detected. Azidohomoalanine (AHA) makes use of this affinity tag: it is incorporated into proteins via Met-tRNA synthetase. This has allowed AHA to help determine the identity of newly synthesized proteins created in response to perturbations and to identify proteins secreted by cells.
Recent studies using ketone and aldehyde condensations show that they are best suited for in vitro or cell surface labeling. However, using ketones and aldehydes as bioorthogonal reporters revealed slow kinetics, indicating that while they are effective for labeling, the concentration must be high.
Certain proteins can be detected via their reactivity to azide groups. Non-proteinogenic amino acids can bear azide groups, which react with phosphines in Staudinger ligations. This reaction has already been used to label other biomolecules in living cells and animals.
The bioorthogonal field is expanding and is driving further applications within proteomics. It is worth noting its limitations and benefits. Rapid reactions can create bioconjugations and achieve high concentrations with low amounts of reactants. Conversely, reactions with slow kinetics, such as aldehyde and ketone condensation, are effective but require a high concentration, which makes them cost-inefficient.
Practical applications
New drug discovery
One major development to come from the study of human genes and proteins has been the identification of potential new drugs for the treatment of disease. This relies on genome and proteome information to identify proteins associated with a disease, which computer software can then use as targets for new drugs. For example, if a certain protein is implicated in a disease, its 3D structure provides the information to design drugs to interfere with the action of the protein. A molecule that fits the active site of an enzyme, but cannot be released by the enzyme, inactivates the enzyme. This is the basis of new drug-discovery tools, which aim to find new drugs to inactivate proteins involved in disease. As genetic differences among individuals are found, researchers expect to use these techniques to develop personalized drugs that are more effective for the individual.
Proteomics is also used to reveal complex plant-insect interactions that help identify candidate genes involved in the defensive response of plants to herbivory.
A branch of proteomics called chemoproteomics provides numerous tools and techniques to detect protein targets of drugs.
Interaction proteomics and protein networks
Interaction proteomics is the analysis of protein interactions at scales ranging from binary interactions to proteome- or network-wide. Most proteins function via protein–protein interactions, and one goal of interaction proteomics is to identify binary protein interactions, protein complexes, and interactomes.
Several methods are available to probe protein–protein interactions. While the most traditional method is yeast two-hybrid analysis, a powerful emerging method is affinity purification followed by protein mass spectrometry using tagged protein baits. Other methods include surface plasmon resonance (SPR), protein microarrays, dual polarisation interferometry, microscale thermophoresis, kinetic exclusion assay, experimental methods such as phage display, and in silico computational methods.
Knowledge of protein-protein interactions is especially useful in regard to biological networks and systems biology, for example in cell signaling cascades and gene regulatory networks (GRNs, where knowledge of protein-DNA interactions is also informative). Proteome-wide analysis of protein interactions, and integration of these interaction patterns into larger biological networks, is crucial towards understanding systems-level biology.
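As a rough illustration of how binary interactions are integrated into a network, the sketch below builds an undirected graph from interaction pairs and groups proteins into connected components. The function names and the example pairs are hypothetical, and treating a connected component as a candidate module is a simplifying assumption, not an established method for defining protein complexes.

```python
from collections import defaultdict

def build_network(interactions):
    """Build an undirected adjacency map from binary interaction pairs."""
    adjacency = defaultdict(set)
    for a, b in interactions:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency

def connected_components(adjacency):
    """Group proteins that are linked by any chain of interactions."""
    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            # Every node is already a key, so this lookup adds no entries.
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components

# Made-up binary interactions, e.g. as reported by a two-hybrid screen.
pairs = [("A", "B"), ("B", "C"), ("D", "E")]
network = build_network(pairs)
modules = sorted(sorted(c) for c in connected_components(network))
```

The number of partners of a protein (`len(network["B"])` in this example) is its degree, a simple measure often used to flag highly connected proteins.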
Expression proteomics
Expression proteomics includes the analysis of protein expression at a larger scale. It helps identify the main proteins in a particular sample, and those proteins differentially expressed in related samples, such as diseased vs. healthy tissue. If a protein is found only in a diseased sample then it can be a useful drug target or diagnostic marker. Proteins with the same or similar expression profiles may also be functionally related. Technologies such as 2D-PAGE and mass spectrometry are used in expression proteomics.
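A comparison of diseased and healthy samples is often expressed as a log2 fold change per protein. The sketch below is a minimal illustration, assuming abundances are already available as per-protein numbers; the function name, the pseudocount used to handle proteins missing from one sample, and the values are all hypothetical.

```python
import math

def log2_fold_changes(diseased, healthy, pseudocount=1.0):
    """Compute log2(diseased / healthy) for every protein seen in either sample.

    A pseudocount is added to both abundances so that proteins found in
    only one sample still yield a finite value (an illustrative choice).
    """
    changes = {}
    for protein in diseased.keys() | healthy.keys():
        d = diseased.get(protein, 0.0) + pseudocount
        h = healthy.get(protein, 0.0) + pseudocount
        changes[protein] = math.log2(d / h)
    return changes

# Made-up abundances for illustration only.
diseased = {"P1": 7.0, "P2": 3.0, "P3": 15.0}
healthy = {"P1": 1.0, "P2": 3.0}
changes = log2_fold_changes(diseased, healthy)
```

Here `P3`, which appears only in the diseased sample, receives the largest positive value, matching the idea that such proteins are candidate drug targets or diagnostic markers.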
Biomarkers
The National Institutes of Health has defined a biomarker as "a characteristic that is objectively measured and evaluated as an indicator of normal biological processes, pathogenic processes, or pharmacologic responses to a therapeutic intervention."
Understanding the proteome, the structure and function of each protein and the complexities of protein–protein interactions are critical for developing the most effective diagnostic techniques and disease treatments in the future. For example, proteomics is highly useful in the identification of candidate biomarkers (proteins in body fluids that are of value for diagnosis), identification of the bacterial antigens that are targeted by the immune response, and identification of possible immunohistochemistry markers of infectious or neoplastic diseases.
An interesting use of proteomics is using specific protein biomarkers to diagnose disease. A number of techniques allow testing for proteins produced during a particular disease, which helps to diagnose the disease quickly. Techniques include western blot, immunohistochemical staining, enzyme-linked immunosorbent assay (ELISA) and mass spectrometry. Secretomics, a subfield of proteomics that studies secreted proteins and secretion pathways using proteomic approaches, has recently emerged as an important tool for the discovery of biomarkers of disease.
Proteogenomics
In proteogenomics, proteomic technologies such as mass spectrometry are used for improving gene annotations. Parallel analysis of the genome and the proteome facilitates discovery of post-translational modifications and proteolytic events, especially when comparing multiple species (comparative proteogenomics).
Structural proteomics
Structural proteomics includes the analysis of protein structures on a large scale. It compares protein structures and helps identify functions of newly discovered genes. Structural analysis also helps to show where drugs bind to proteins and where proteins interact with each other. This understanding is achieved using different technologies such as X-ray crystallography and NMR spectroscopy.
Bioinformatics for proteomics (proteome informatics)
Much proteomics data is collected with the help of high-throughput technologies such as mass spectrometry and microarrays. It would often take weeks or months to analyze the data and perform comparisons by hand. For this reason, biologists and chemists are collaborating with computer scientists and mathematicians to create programs and pipelines to computationally analyze the protein data. Using bioinformatics techniques, researchers are capable of faster analysis and data storage. A good place to find lists of current programs and databases is the ExPASy bioinformatics resource portal. The applications of bioinformatics-based proteomics include medicine, disease diagnosis, biomarker identification, and many more.
Protein identification
Mass spectrometry and microarrays produce peptide fragmentation information but do not identify the specific proteins present in the original sample. Because of this, researchers in the past were forced to decipher the peptide fragments themselves. However, there are currently programs available for protein identification. These programs take the peptide sequences output from mass spectrometry and microarrays and return information about matching or similar proteins. This is done through algorithms implemented by the program which perform alignments with proteins from known databases such as UniProt and PROSITE to predict what proteins are in the sample with a degree of certainty.
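The matching step can be illustrated with a deliberately simplified sketch. Instead of the alignment algorithms mentioned above, it uses exact matching of observed peptides against an in silico tryptic digest of each database protein; this substitution, the function names, and the tiny made-up database are assumptions for illustration, and real identification programs score matches far more carefully. The digestion rule used is the usual one for trypsin: cleavage after lysine (K) or arginine (R), except when the next residue is proline (P).

```python
def tryptic_digest(sequence):
    """Split a protein sequence after K or R, unless followed by P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide
    return peptides

def identify(observed_peptides, protein_db):
    """Count, per database protein, how many observed peptides it explains."""
    observed = set(observed_peptides)
    scores = {}
    for name, sequence in protein_db.items():
        matched = set(tryptic_digest(sequence)) & observed
        if matched:
            scores[name] = len(matched)
    return scores

# Hypothetical database and observations (sequences made up).
protein_db = {"protX": "MAGKTLRPEEKSS", "protY": "GGGKAAA"}
scores = identify(["TLRPEEK", "SS"], protein_db)
```

A protein explained by more observed peptides is a stronger candidate, which is a crude stand-in for the "degree of certainty" that identification programs report.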
Protein structure
The biomolecular structure forms the 3D configuration of the protein. Understanding the protein's structure aids in the identification of the protein's interactions and function. It used to be that the 3D structure of proteins could only be determined using X-ray crystallography and NMR spectroscopy. As of 2017, cryo-electron microscopy is a leading technique, solving difficulties with crystallization (in X-ray crystallography) and conformational ambiguity (in NMR); resolution was 2.2 Å as of 2015. Now, through bioinformatics, there are computer programs that can in some cases predict and model the structure of proteins. These programs use the chemical properties of amino acids and structural properties of known proteins to predict the 3D model of sample proteins. This also allows scientists to model protein interactions on a larger scale. In addition, biomedical engineers are developing methods to factor in the flexibility of protein structures to make comparisons and predictions.
Post-translational modifications
Most programs available for protein analysis are not written for proteins that have undergone post-translational modifications. Some programs will accept post-translational modifications to aid in protein identification but then ignore the modification during further protein analysis. It is important to account for these modifications since they can affect the protein's structure. In turn, computational analysis of post-translational modifications has gained the attention of the scientific community. The current post-translational modification programs are only predictive. Chemists, biologists and computer scientists are working together to create and introduce new pipelines that allow for analysis of post-translational modifications that have been experimentally identified for their effect on the protein's structure and function.
Computational methods in studying protein biomarkers
One example of the use of bioinformatics and computational methods is the study of protein biomarkers. Computational predictive models have shown that extensive and diverse feto-maternal protein trafficking occurs during pregnancy and can be readily detected non-invasively in maternal whole blood. This computational approach circumvented a major limitation to fetal proteomic analysis of maternal blood: the abundance of maternal proteins interfering with the detection of fetal proteins. Computational models can use fetal gene transcripts previously identified in maternal whole blood to create a comprehensive proteomic network of the term neonate. Such work shows that the fetal proteins detected in a pregnant woman's blood originate from a diverse group of tissues and organs of the developing fetus. The proteomic networks contain many biomarkers that are proxies for development and illustrate the potential clinical application of this technology as a way to monitor normal and abnormal fetal development.
An information-theoretic framework has also been introduced for biomarker discovery, integrating biofluid and tissue information. This approach takes advantage of functional synergy between certain biofluids and tissues, with the potential for clinically significant findings that would not be possible if tissues and biofluids were considered individually. By conceptualizing tissue-biofluid pairs as information channels, significant biofluid proxies can be identified and then used for the guided development of clinical diagnostics. Candidate biomarkers are then predicted based on information transfer criteria across the tissue-biofluid channels. Significant biofluid-tissue relationships can be used to prioritize clinical validation of biomarkers.
Emerging trends
A number of emerging concepts have the potential to improve the current features of proteomics. Obtaining absolute quantification of proteins and monitoring post-translational modifications are the two tasks that most affect the understanding of protein function in healthy and diseased cells. For many cellular events, the protein concentrations do not change; rather, their function is modulated by post-translational modifications (PTMs). Methods of monitoring PTMs are an underdeveloped area in proteomics. Selecting a particular subset of proteins for analysis substantially reduces protein complexity, making it advantageous for diagnostic purposes where blood is the starting material. Another important aspect of proteomics, not yet addressed, is that proteomics methods should focus on studying proteins in the context of their environment. The increasing use of chemical cross-linkers, introduced into living cells to fix protein-protein, protein-DNA and other interactions, may partially ameliorate this problem. The challenge is to identify suitable methods of preserving relevant interactions. Another goal in studying proteins is to develop more sophisticated methods to image proteins and other molecules in living cells and in real time.
Systems biology
Advances in quantitative proteomics would clearly enable more in-depth analysis of cellular systems. Another research frontier is the analysis of single cells, and of protein covariation across single cells, which reflects biological processes such as protein complex formation, immune functions, the cell cycle and the priming of cancer cells for drug resistance. Biological systems are subject to a variety of perturbations (cell cycle, cellular differentiation, carcinogenesis, the biophysical environment, etc.). Transcriptional and translational responses to these perturbations result in functional changes to the proteome implicated in response to the stimulus. Therefore, describing and quantifying proteome-wide changes in protein abundance is crucial towards understanding biological phenomena more holistically, on the level of the entire system. In this way, proteomics can be seen as complementary to genomics, transcriptomics, epigenomics, metabolomics, and other -omics approaches in integrative analyses attempting to define biological phenotypes more comprehensively. As an example, The Cancer Proteome Atlas provides quantitative protein expression data for ~200 proteins in over 4,000 tumor samples with matched transcriptomic and genomic data from The Cancer Genome Atlas. Similar datasets in other cell types, tissue types, and species, particularly using deep shotgun mass spectrometry, will be an immensely important resource for research in fields like cancer biology, developmental and stem cell biology, medicine, and evolutionary biology.
Human plasma proteome
Characterizing the human plasma proteome has become a major goal in proteomics, but plasma is also one of the most challenging proteomes of all human tissues. In addition to resident hemostatic proteins, it contains immunoglobulins, cytokines, protein hormones, and secreted proteins indicative of infection. It also contains tissue leakage proteins, because blood circulates through the different tissues of the body. The blood thus carries information on the physiological state of all tissues, which, combined with its accessibility, makes the blood proteome invaluable for medical purposes.
The plasma proteome spans a dynamic range of more than 10^10 between the most abundant protein (albumin) and the least abundant ones (some cytokines), which is thought to be one of the main challenges for proteomics. Temporal and spatial dynamics further complicate the study of the human plasma proteome: some proteins turn over much faster than others, and the protein content of an artery may differ substantially from that of a vein. All these differences make even the simplest proteomic task, cataloging the proteome, seem out of reach. To tackle this problem, priorities need to be established. The first is to capture the most meaningful subset of proteins among the entire proteome to generate a diagnostic tool. Second, since cancer is associated with enhanced glycosylation of proteins, methods that focus on this part of proteins will also be useful. Here too, multiparameter analysis best reveals a pathological state. As these technologies improve, disease profiles should be continually related to the respective gene expression changes.
Because of these problems, plasma proteomics long remained challenging. However, technological advances appear to have led to a revival of the field, as shown recently by an approach called plasma proteome profiling. With such technologies, researchers have been able to investigate inflammation processes in mice, study the heritability of plasma proteomes, and show the effect of a common lifestyle change such as weight loss on the plasma proteome.
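To give a sense of the dynamic range described in this section, the following minimal sketch computes how many orders of magnitude separate a highly abundant plasma protein from a low-abundance one. The two concentrations are illustrative assumptions (roughly 40 mg/mL for albumin and a few pg/mL for a cytokine), not values taken from this article, and the function name is hypothetical.

```python
import math

# Illustrative, assumed concentrations in grams per millilitre.
# These are rough orders of magnitude, not measurements from the article.
ALBUMIN_G_PER_ML = 40e-3    # about 40 mg/mL (assumed)
CYTOKINE_G_PER_ML = 4e-12   # about 4 pg/mL (assumed)


def dynamic_range_orders(high: float, low: float) -> float:
    """Return the number of orders of magnitude between two concentrations."""
    if high <= 0 or low <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(high / low)


orders = dynamic_range_orders(ALBUMIN_G_PER_ML, CYTOKINE_G_PER_ML)
print(f"Dynamic range: about 10^{orders:.0f}")
```

Under these assumed values the ratio is on the order of 10^10, which illustrates why a single measurement method struggles to detect both classes of protein in the same sample.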
Journals
Numerous journals are dedicated to the field of proteomics and related areas. Note that journals dealing with ''proteins'' are usually more focused on structure and function, while ''proteomics'' journals are more focused on the large-scale analysis of whole proteomes or at least large sets of proteins. Some of the more important ones are listed below (with their publishers).
* ''Molecular and Cellular Proteomics'' (ASBMB)
* ''Journal of Proteome Research'' (ACS)
* ''Journal of Proteomics'' (Elsevier)
* ''Proteomics'' (Wiley)
See also
* Activity-based proteomics
* Bottom-up proteomics
* Cytomics
* Functional genomics
* Heat stabilization
* Human proteome project
* Immunoproteomics
* List of biological databases
* List of omics topics in biology
* PEGylation
* Phosphoproteomics
* Protein production
* Proteogenomics
* Proteomic chemistry
* Secretomics
* Shotgun proteomics
* Top-down proteomics
* Systems biology
* Yeast two-hybrid system
* TCP-seq
* Glycomics
Protein databases
* Human Protein Atlas
* Human Protein Reference Database
* National Center for Biotechnology Information (NCBI)
* PeptideAtlas
* Protein Data Bank (PDB)
* Protein Information Resource (PIR)
* Proteomics Identifications Database (PRIDE)
* Proteopedia: the collaborative, 3D encyclopedia of proteins and other molecules
* Swiss-Prot
* UniProt
Research centers
* European Bioinformatics Institute
* Netherlands Proteomics Centre (NPC)