Protein structure is the three-dimensional arrangement of atoms in an amino acid-chain molecule. Proteins are polymers, specifically polypeptides, formed from sequences of amino acids, the monomers of the polymer. A single amino acid monomer may also be called a ''residue'', indicating a repeating unit of a polymer. Proteins form by amino acids undergoing condensation reactions, in which the amino acids lose one water molecule per reaction in order to attach to one another with a peptide bond. By convention, a chain under 30 amino acids is often identified as a peptide, rather than a protein. To be able to perform their biological function, proteins fold into one or more specific spatial conformations driven by a number of non-covalent interactions such as hydrogen bonding, ionic interactions, Van der Waals forces, and hydrophobic packing. To understand the functions of proteins at a molecular level, it is often necessary to determine their three-dimensional structure. This is the topic of the scientific field of structural biology, which employs techniques such as X-ray crystallography, NMR spectroscopy, cryo-electron microscopy (cryo-EM) and dual polarisation interferometry to determine the structure of proteins.

Protein structures range in size from tens to several thousand amino acids. By physical size, proteins are classified as nanoparticles, between 1 and 100 nm. Very large protein complexes can be formed from protein subunits. For example, many thousands of actin molecules assemble into a microfilament.

A protein usually undergoes reversible structural changes in performing its biological function. The alternative structures of the same protein are referred to as different conformations, and transitions between them are called conformational changes.
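
As a rough illustration of the condensation reaction described above, the following Python sketch estimates the molar mass of a short linear peptide by summing the masses of the free amino acids and subtracting one water molecule per peptide bond. The function names, the small mass table (approximate average molar masses in g/mol) and the example tetrapeptide are assumptions made for illustration only.

```python
# Approximate average molar masses (g/mol) of a few free amino acids.
# Illustrative subset only; values are rounded.
FREE_AMINO_ACID_MASS = {
    "Gly": 75.07,
    "Ala": 89.09,
    "Ser": 105.09,
    "Val": 117.15,
}
WATER_MASS = 18.015  # g/mol, lost once per peptide bond formed


def peptide_bond_count(residues):
    """A linear chain of n residues contains n - 1 peptide bonds."""
    return max(len(residues) - 1, 0)


def peptide_mass(residues):
    """Sum the free amino acid masses, minus one water per peptide bond."""
    total = sum(FREE_AMINO_ACID_MASS[name] for name in residues)
    return total - peptide_bond_count(residues) * WATER_MASS


# Hypothetical example chain, listed from N-terminus to C-terminus.
tetrapeptide = ["Val", "Gly", "Ser", "Ala"]
print(peptide_bond_count(tetrapeptide), "peptide bonds")
print(round(peptide_mass(tetrapeptide), 2), "g/mol (approximate)")
```

The sketch ignores post-translational modifications and disulfide bonds, both of which would change the mass of a real protein.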


Levels of protein structure

There are four distinct levels of protein structure.


Primary structure

The primary structure of a protein refers to the sequence of amino acids in the polypeptide chain. The primary structure is held together by peptide bonds that are made during the process of protein biosynthesis. The two ends of the polypeptide chain are referred to as the carboxyl terminus (C-terminus) and the amino terminus (N-terminus) based on the nature of the free group on each extremity. Counting of residues always starts at the N-terminal end (NH2-group), which is the end where the amino group is not involved in a peptide bond. The primary structure of a protein is determined by the gene corresponding to the protein. A specific sequence of nucleotides in DNA is transcribed into mRNA, which is read by the ribosome in a process called translation. The sequence of amino acids in insulin was discovered by Frederick Sanger, establishing that proteins have defining amino acid sequences. The sequence of a protein is unique to that protein, and defines the structure and function of the protein. The sequence of a protein can be determined by methods such as Edman degradation or tandem mass spectrometry. Often, however, it is read directly from the sequence of the gene using the genetic code. It is strongly recommended to use the words "amino acid residues" when discussing proteins because when a peptide bond is formed, a water molecule is lost, and therefore proteins are made up of amino acid residues. Post-translational modifications such as phosphorylations and glycosylations are usually also considered a part of the primary structure, and cannot be read from the gene. For example, insulin is composed of 51 amino acids in 2 chains. One chain has 30 amino acids, and the other has 21 amino acids.
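
The transcription and translation steps described above can be sketched in a few lines of Python. This is a minimal sketch, assuming the input is the coding (sense) strand of the DNA, so that the mRNA has the same sequence with U in place of T, and assuming the standard genetic code. The function names and the short example sequence are hypothetical.

```python
BASES = "UCAG"
# Standard genetic code as one-letter amino acid codes, listed in UCAG codon
# order (UUU, UUC, UUA, UUG, UCU, ...). "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def transcribe(coding_dna):
    """Return the mRNA matching a DNA coding strand (T replaced by U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """Read codons from the first base and stop at the first stop codon."""
    protein = []
    for pos in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical coding sequence: ATG GGC ATC GTG TAA
mrna = transcribe("ATGGGCATCGTGTAA")
print(mrna)             # AUGGGCAUCGUGUAA
print(translate(mrna))  # MGIV
```

The resulting string is written from the N-terminus to the C-terminus, matching the residue-counting convention described above. Post-translational modifications are not captured, since they cannot be read from the gene.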


Secondary structure

Secondary structure refers to highly regular local sub-structures on the actual polypeptide backbone chain. Two main types of secondary structure, the α-helix and the β-strand or β-sheets, were suggested in 1951 by Linus Pauling et al. These secondary structures are defined by patterns of hydrogen bonds between the main-chain peptide groups. They have a regular geometry, being constrained to specific values of the dihedral angles ψ and φ on the Ramachandran plot. Both the α-helix and the β-sheet represent a way of saturating all the hydrogen bond donors and acceptors in the peptide backbone. Some parts of the protein are ordered but do not form any regular structures. They should not be confused with random coil, an unfolded polypeptide chain lacking any fixed three-dimensional structure. Several sequential secondary structures may form a "supersecondary unit".
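
The dihedral angles mentioned above are each defined by four consecutive backbone atoms: by the usual convention, φ uses C(i−1), N(i), Cα(i), C(i) and ψ uses N(i), Cα(i), C(i), N(i+1). The following Python sketch computes a signed dihedral angle from four points using the standard atan2 formulation. The function name and the test coordinates are illustrative assumptions and are not taken from a real structure.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral_degrees(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) about the p1-p2 bond."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal to the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal to the plane through p1, p2, p3
    x = _dot(n1, n2)
    y = _dot(_cross(n1, n2), b2) / math.sqrt(_dot(b2, b2))
    return math.degrees(math.atan2(y, x))


# Made-up coordinates: a quarter turn about the z axis.
print(dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Applying this function to each residue's backbone atoms yields the (φ, ψ) pairs that are plotted on a Ramachandran plot.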


Tertiary structure

Tertiary structure refers to the three-dimensional structure created by a single protein molecule (a single polypeptide chain). It may include one or several domains. The α-helices and β-pleated sheets are folded into a compact globular structure. The folding is driven by the ''non-specific'' hydrophobic interactions, the burial of hydrophobic residues from water, but the structure is stable only when the parts of a protein domain are locked into place by ''specific'' tertiary interactions, such as salt bridges, hydrogen bonds, and the tight packing of side chains and disulfide bonds. The disulfide bonds are extremely rare in cytosolic proteins, since the cytosol (intracellular fluid) is generally a reducing environment.


Quaternary structure

Quaternary structure is the three-dimensional structure consisting of the aggregation of two or more individual polypeptide chains (subunits) that operate as a single functional unit (multimer). The resulting multimer is stabilized by the same non-covalent interactions and disulfide bonds as in tertiary structure. There are many possible quaternary structure organisations. Complexes of two or more polypeptides (i.e. multiple subunits) are called multimers. Specifically, a multimer is called a dimer if it contains two subunits, a trimer if it contains three subunits, a tetramer if it contains four subunits, and a pentamer if it contains five subunits. The subunits are frequently related to one another by symmetry operations, such as a 2-fold axis in a dimer. Multimers made up of identical subunits are referred to with a prefix of "homo-" and those made up of different subunits are referred to with a prefix of "hetero-", for example, a heterotetramer, such as the two alpha and two beta chains of hemoglobin.
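
The naming scheme above is mechanical enough to express as a short Python sketch. The function name `describe_multimer` and the use of plain strings as subunit labels are assumptions for illustration. Sizes beyond the four named in the text fall back to a generic "n-mer" label.

```python
# Names for the subunit counts mentioned in the text.
SIZE_NAMES = {2: "dimer", 3: "trimer", 4: "tetramer", 5: "pentamer"}


def describe_multimer(subunits):
    """Name a complex from a list of subunit labels, e.g. ["alpha", "beta"]."""
    count = len(subunits)
    if count < 2:
        raise ValueError("a multimer needs at least two subunits")
    base = SIZE_NAMES.get(count, f"{count}-mer")
    # Identical subunits give "homo-", differing subunits give "hetero-".
    prefix = "homo" if len(set(subunits)) == 1 else "hetero"
    return prefix + base


# Hemoglobin: two alpha and two beta chains.
print(describe_multimer(["alpha", "alpha", "beta", "beta"]))  # heterotetramer
```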


Domains, motifs, and folds in protein structure

Proteins are frequently described as consisting of several structural units. These units include domains, motifs, and folds. Despite the fact that there are about 100,000 different proteins expressed in eukaryotic systems, there are many fewer different domains, structural motifs and folds.


Structural domain

A structural domain is an element of the protein's overall structure that is self-stabilizing and often folds independently of the rest of the protein chain. Many domains are not unique to the protein products of one gene or one gene family but instead appear in a variety of proteins. Domains often are named and singled out because they figure prominently in the biological function of the protein they belong to; for example, the "calcium-binding domain of calmodulin". Because they are independently stable, domains can be "swapped" by genetic engineering between one protein and another to make chimera proteins. A conservative combination of several domains that occur in different proteins, such as the protein tyrosine phosphatase domain and C2 domain pair, has been called "a superdomain" that may evolve as a single unit.


Structural and sequence motifs

The structural and sequence motifs refer to short segments of protein three-dimensional structure or amino acid sequence that were found in a large number of different proteins.


Supersecondary structure

The supersecondary structure refers to a specific combination of secondary structure elements, such as β-α-β units or a helix-turn-helix motif. Some of them may also be referred to as structural motifs.


Protein fold

A protein fold refers to the general protein architecture, like a helix bundle, β-barrel, Rossmann fold or different "folds" provided in the Structural Classification of Proteins database. A related concept is protein topology.


Protein dynamics and conformational ensembles

Proteins are not static objects, but rather populate ensembles of conformational states. Transitions between these states typically occur on nanoscales, and have been linked to functionally relevant phenomena such as allosteric signaling and enzyme catalysis. Protein dynamics and conformational changes allow proteins to function as nanoscale biological machines within cells, often in the form of multi-protein complexes. Examples include motor proteins, such as myosin, which is responsible for muscle contraction, kinesin, which moves cargo inside cells away from the nucleus along microtubules, and dynein, which moves cargo inside cells towards the nucleus and produces the axonemal beating of motile cilia and flagella. "[I]n effect, the [motile cilium] is a nanomachine composed of perhaps over 600 proteins in molecular complexes, many of which also function independently as nanomachines...
Flexible linkers allow the mobile protein domains connected by them to recruit their binding partners and induce long-range allostery via protein domain dynamics."

Proteins are often thought of as relatively stable tertiary structures that undergo conformational changes when they interact with other proteins or take part in enzymatic activity. However, proteins vary in stability, and some of the less stable variants are intrinsically disordered proteins. These proteins exist and function in a relatively 'disordered' state that lacks a stable tertiary structure. As a result, they are difficult to describe by a single fixed tertiary structure. Conformational ensembles have been devised as a way to provide a more accurate and 'dynamic' representation of the conformational state of intrinsically disordered proteins.

Protein ensemble files are a representation of a protein that can be considered to have a flexible structure. Creating these files requires determining which of the various theoretically possible protein conformations actually exist. One approach is to apply computational algorithms to the protein data in order to determine the most likely set of conformations for an ensemble file. There are multiple methods for preparing data for the Protein Ensemble Database, and they fall into two general methodologies: pool-based and molecular dynamics (MD) approaches (diagrammed in the figure). The pool-based approach uses the protein's amino acid sequence to create a massive pool of random conformations. This pool is then processed computationally to derive a set of theoretical parameters for each conformation from its structure. Conformational subsets from this pool whose average theoretical parameters closely match known experimental data for the protein are selected. The alternative molecular dynamics approach takes multiple random conformations at a time and subjects all of them to restraints derived from experimental data. Here the experimental data serve as limits placed on the conformations (e.g. known distances between atoms). Only conformations that remain within the limits set by the experimental data are accepted. This approach often applies large amounts of experimental data to the conformations, which is a very computationally demanding task.

Conformational ensembles have been generated for a number of highly dynamic and partially unfolded proteins, such as Sic1/Cdc4, p15 PAF, MKK7, beta-synuclein and P27.
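
The two selection strategies described above can be illustrated with a small sketch. It is not the procedure used by any particular ensemble tool: each conformation is reduced to a single hypothetical back-calculated parameter, the pool and the experimental target are made-up numbers, and the greedy swap search (`select_ensemble`) and the restraint filter (`within_restraints`) are illustrative names chosen here.

```python
import random

def ensemble_error(values, members, target):
    """Absolute gap between the ensemble-average parameter and the target."""
    mean = sum(values[i] for i in members) / len(members)
    return abs(mean - target)

def select_ensemble(values, target, size, n_steps=5000, seed=0):
    """Pool-style selection: greedy swaps until the subset average fits `target`."""
    rng = random.Random(seed)
    members = rng.sample(range(len(values)), size)
    best = ensemble_error(values, members, target)
    for _ in range(n_steps):
        slot = rng.randrange(size)
        candidate = rng.randrange(len(values))
        if candidate in members:
            continue  # keep ensemble members distinct
        trial = members[:slot] + [candidate] + members[slot + 1:]
        error = ensemble_error(values, trial, target)
        if error < best:  # accept only swaps that improve the fit
            members, best = trial, error
    return members, best

def within_restraints(distances, limits):
    """MD-style filter: each distance must lie inside its (low, high) limit."""
    return all(low <= d <= high for d, (low, high) in zip(distances, limits))

# Hypothetical pool: one theoretical parameter per random conformation.
pool_rng = random.Random(42)
pool = [pool_rng.gauss(30.0, 5.0) for _ in range(2000)]
experimental_value = 27.5  # made-up experimental average

members, error = select_ensemble(pool, experimental_value, size=20)
print(f"ensemble of {len(members)} conformations, |mean - target| = {error:.4f}")

# A conformation is kept only if all of its distances respect the limits.
print(within_restraints([4.8, 7.1], [(4.0, 6.0), (6.5, 8.0)]))
```

The pool-style function fits an ensemble average after the conformations exist, whereas the restraint filter rejects individual conformations as they are produced, which mirrors the distinction drawn in the text.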


Protein folding

As they are translated, polypeptides exit the ribosome mostly as random coils and fold into their native state. The final structure of the protein chain is generally assumed to be determined by its amino acid sequence (Anfinsen's dogma).


Protein stability

The thermodynamic stability of a protein is the free energy difference between its folded and unfolded states. This free energy difference is very sensitive to temperature, so a change in temperature may result in unfolding or denaturation. Protein denaturation may result in loss of function and loss of the native state. The free energy of stabilization of soluble globular proteins typically does not exceed 50 kJ/mol. Given the large number of hydrogen bonds that stabilize secondary structures, and the stabilization of the inner core through hydrophobic interactions, the free energy of stabilization emerges as a small difference between large numbers.
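
The "small difference between large numbers" and the temperature sensitivity can be made concrete with a rough sketch. It assumes a simple two-state (folded/unfolded) model with temperature-independent enthalpy and entropy of unfolding, which ignores the heat capacity change seen in real proteins. The values of `DELTA_H` and `DELTA_S` are hypothetical, chosen only so that the stability at room temperature stays under 50 kJ/mol.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol*K)

# Hypothetical unfolding parameters (not measured values for any protein).
DELTA_H = 400.0  # enthalpy of unfolding, kJ/mol
DELTA_S = 1.2    # entropy of unfolding, kJ/(mol*K)

def delta_g_unfold(temperature_k):
    """Free energy of unfolding, dG = dH - T*dS (two large, opposing terms)."""
    return DELTA_H - temperature_k * DELTA_S

def fraction_folded(temperature_k):
    """Two-state population of the folded state at a given temperature."""
    k_unfold = math.exp(-delta_g_unfold(temperature_k) / (R * temperature_k))
    return 1.0 / (1.0 + k_unfold)

melting_temperature = DELTA_H / DELTA_S  # temperature at which dG = 0

for t in (298.15, melting_temperature, 343.15):
    print(f"T = {t:6.2f} K  dG = {delta_g_unfold(t):7.2f} kJ/mol  "
          f"folded = {fraction_folded(t):.4f}")
```

With these assumed numbers, an enthalpy term of 400 kJ/mol and an entropy term of roughly 358 kJ/mol leave a net stability of about 42 kJ/mol at 298 K, and a rise of a few tens of kelvin is enough to move the population from almost fully folded to mostly unfolded.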


Protein structure determination

Around 90% of the protein structures available in the Protein Data Bank have been determined by X-ray crystallography. This method measures the three-dimensional (3-D) density distribution of electrons in the protein, in the crystallized state, from which the 3-D coordinates of all the atoms can be inferred to a certain resolution. Roughly 9% of the known protein structures have been obtained by nuclear magnetic resonance (NMR) techniques. For larger protein complexes, cryo-electron microscopy can determine protein structures. The resolution is typically lower than that of X-ray crystallography or NMR, but the maximum resolution is steadily increasing. This technique is still particularly valuable for very large protein complexes such as virus coat proteins and amyloid fibers.

General secondary structure composition can be determined via circular dichroism. Vibrational spectroscopy can also be used to characterize the conformation of peptides, polypeptides, and proteins. Two-dimensional infrared spectroscopy has become a valuable method to investigate the structures of flexible peptides and proteins that cannot be studied with other methods. A more qualitative picture of protein structure is often obtained by proteolysis, which is also useful to screen for more crystallizable protein samples. Novel implementations of this approach, including fast parallel proteolysis (FASTpp), can probe the structured fraction and its stability without the need for purification. Once a protein's structure has been experimentally determined, further detailed studies can be done computationally, using molecular dynamics simulations of that structure.


Protein structure databases

A protein structure database is a database that is modeled around the various experimentally determined protein structures. The aim of most protein structure databases is to organize and annotate the protein structures, providing the biological community access to the experimental data in a useful way. Data included in protein structure databases often include 3D coordinates as well as experimental information, such as unit cell dimensions and angles for structures determined by X-ray crystallography. Most entries, which here are either proteins or specific structure determinations of a protein, also contain sequence information, and some databases even provide means for performing sequence-based queries. Nevertheless, the primary attribute of a structure database is structural information, whereas sequence databases focus on sequence information and contain no structural information for the majority of entries. Protein structure databases are critical for many efforts in computational biology such as structure-based drug design, both in developing the computational methods used and in providing a large experimental dataset used by some methods to provide insights about the function of a protein.
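
As an illustration of the kind of data such an entry holds, the sketch below reads atomic coordinates and unit cell parameters from two records in the fixed-column PDB text format used by the Protein Data Bank. The two sample lines are made up for this example and do not come from a real entry; the `Atom` class and the function names are illustrative choices, and only a few of the fields in each record are extracted.

```python
from dataclasses import dataclass

@dataclass
class Atom:
    name: str
    residue: str
    chain: str
    residue_number: int
    x: float
    y: float
    z: float

def parse_atom(line):
    """Read one ATOM record using the fixed column positions of the PDB format."""
    return Atom(
        name=line[12:16].strip(),
        residue=line[17:20].strip(),
        chain=line[21],
        residue_number=int(line[22:26]),
        x=float(line[30:38]),
        y=float(line[38:46]),
        z=float(line[46:54]),
    )

def parse_cryst1(line):
    """Read unit cell lengths (angstroms) and angles (degrees) from a CRYST1 record."""
    lengths = (float(line[6:15]), float(line[15:24]), float(line[24:33]))
    angles = (float(line[33:40]), float(line[40:47]), float(line[47:54]))
    return lengths, angles

# Made-up sample records, aligned to the PDB column layout.
SAMPLE_CRYST1 = "CRYST1   52.000   58.600   61.900  90.00  90.00  90.00 P 21 21 21"
SAMPLE_ATOM = "ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67"

atom = parse_atom(SAMPLE_ATOM)
cell_lengths, cell_angles = parse_cryst1(SAMPLE_CRYST1)
print(atom)
print(cell_lengths, cell_angles)
```

In practice a maintained parser from an established structural biology library would be preferable to hand-written slicing, since real files contain many more record types and edge cases.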


Structural classifications of proteins

Protein structures can be grouped based on their structural similarity, topological class or a common evolutionary origin. The Structural Classification of Proteins database and CATH database provide two different structural classifications of proteins. When the structural similarity is large, the two proteins have possibly diverged from a common ancestor, and shared structure between proteins is considered evidence of homology. Structural similarity can then be used to group proteins together into protein superfamilies. If shared structure is significant but the fraction shared is small, the fragment shared may be the consequence of a more dramatic evolutionary event such as horizontal gene transfer, and joining proteins sharing these fragments into protein superfamilies is no longer justified. The topology of a protein can be used to classify proteins as well. Knot theory and circuit topology are two topology frameworks developed for classification of protein folds based on chain crossing and intrachain contacts, respectively.
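
To give a flavor of classification by intrachain contacts, the following sketch labels the arrangement of two contacts along the chain as series, parallel or cross, which are the basic pairwise relations used in circuit topology. Each contact is given as a pair of residue indices. The function name and the handling of contacts that share a residue are simplifications assumed for this example, not a full implementation of the framework.

```python
def contact_relation(contact_a, contact_b):
    """Classify two intrachain contacts, each given as a pair of residue indices."""
    (i, j), (k, l) = sorted([tuple(sorted(contact_a)), tuple(sorted(contact_b))])
    if len({i, j, k, l}) < 4:
        return "shared site"  # contacts touching the same residue: not treated here
    if j < k:
        return "series"       # i < j < k < l: one contact ends before the other starts
    if l < j:
        return "parallel"     # i < k < l < j: one contact is nested inside the other
    return "cross"            # i < k < j < l: the two contacts interleave

print(contact_relation((1, 5), (7, 12)))
print(contact_relation((1, 20), (5, 10)))
print(contact_relation((1, 10), (5, 15)))
```

Tabulating these relations over all pairs of contacts gives a topology description of a fold that does not depend on the exact geometry of the chain.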


Computational prediction of protein structure

The generation of a protein sequence is much easier than the determination of a protein structure. However, the structure of a protein gives much more insight into the function of the protein than its sequence. Therefore, a number of methods for the computational prediction of protein structure from its sequence have been developed. ''Ab initio'' prediction methods use just the sequence of the protein. Threading and homology modeling methods can build a 3-D model for a protein of unknown structure from experimental structures of evolutionarily related proteins, called a protein family.
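
Template-based methods depend on recognizing that a protein of unknown structure is related to one of known structure. One common, simple indicator of relatedness is the percent identity between two aligned sequences, sketched below. This is an assumption added for illustration rather than a step taken from the text: the alignment is supplied by hand, the sequences are invented, and definitions of percent identity differ in which positions they count (here, only columns without gaps).

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over the gap-free columns of an alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)

# Invented, pre-aligned sequences (one-letter amino acid codes, '-' marks a gap).
target_seq = "MKT-AYIA"
template_seq = "MKTQAYLA"
print(f"{percent_identity(target_seq, template_seq):.1f}% identity")
```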


See also

* Biomolecular structure
* Gene structure
* Nucleic acid structure
* Ribbon diagram (3D schematic representation of proteins)


Further reading


* 50 Years of Protein Structure Determination Timeline (HTML version), National Institute of General Medical Sciences at NIH

