Protein structure is the three-dimensional arrangement of atoms in an amino acid-chain molecule. Proteins are polymers, specifically polypeptides, formed from sequences of amino acids, the monomers of the polymer. A single amino acid monomer may also be called a ''residue'', indicating a repeating unit of a polymer. Proteins form by amino acids undergoing condensation reactions, in which the amino acids lose one water molecule per reaction in order to attach to one another with a peptide bond. By convention, a chain under 30 amino acids is often identified as a peptide, rather than a protein.
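The bookkeeping implied by the paragraph above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the chain is assumed to be a single linear (unbranched, non-cyclic) polypeptide, and the 30-residue cutoff is the convention quoted in the text rather than a strict rule.

```python
def waters_released(n_residues: int) -> int:
    """Water molecules lost when n amino acids condense into one linear chain.

    Each peptide bond forms with the loss of one water molecule, and a
    linear chain of n residues contains n - 1 peptide bonds.
    """
    if n_residues < 1:
        raise ValueError("a chain needs at least one residue")
    return n_residues - 1


def chain_label(n_residues: int, peptide_cutoff: int = 30) -> str:
    """Label a chain by the length convention quoted in the text (assumed cutoff)."""
    return "peptide" if n_residues < peptide_cutoff else "protein"


if __name__ == "__main__":
    for n in (2, 29, 30, 300):
        print(n, waters_released(n), chain_label(n))
```

The cutoff is exposed as a parameter because the text describes it as a convention that is "often" used, not a fixed definition.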
To be able to perform their biological function, proteins fold into one or more specific spatial conformations driven by a number of non-covalent interactions such as hydrogen bonding, ionic interactions, Van der Waals forces, and hydrophobic packing. To understand the functions of proteins at a molecular level, it is often necessary to determine their three-dimensional structure. This is the topic of the scientific field of structural biology, which employs techniques such as X-ray crystallography, NMR spectroscopy, cryo electron microscopy (cryo-EM) and dual polarisation interferometry to determine the structure of proteins.
Protein structures range in size from tens to several thousand amino acids. By physical size, proteins are classified as nanoparticles, between 1–100 nm. Very large protein complexes can be formed from protein subunits. For example, many thousands of actin molecules assemble into a microfilament.
A protein usually undergoes reversible structural changes in performing its biological function. The alternative structures of the same protein are referred to as different conformations, and transitions between them are called conformational changes.
Levels of protein structure
There are four distinct levels of protein structure.
Primary structure
The primary structure of a protein refers to the sequence of amino acids in the polypeptide chain. The primary structure is held together by peptide bonds that are made during the process of protein biosynthesis. The two ends of the polypeptide chain are referred to as the carboxyl terminus (C-terminus) and the amino terminus (N-terminus) based on the nature of the free group on each extremity. Counting of residues always starts at the N-terminal end (NH2 group), which is the end where the amino group is not involved in a peptide bond. The primary structure of a protein is determined by the gene corresponding to the protein. A specific sequence of nucleotides in DNA is transcribed into mRNA, which is read by the ribosome in a process called translation. The sequence of amino acids in insulin was discovered by Frederick Sanger, establishing that proteins have defining amino acid sequences. The sequence of a protein is unique to that protein, and defines the structure and function of the protein. The sequence of a protein can be determined by methods such as Edman degradation or tandem mass spectrometry. Often, however, it is read directly from the sequence of the gene using the genetic code. It is strictly recommended to use the words "amino acid residues" when discussing proteins because when a peptide bond is formed, a water molecule is lost, and therefore proteins are made up of amino acid residues.
Post-translational modifications such as phosphorylations and glycosylations are usually also considered a part of the primary structure, and cannot be read from the gene. For example, insulin is composed of 51 amino acids in 2 chains. One chain has 21 amino acids, and the other has 30 amino acids.
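Reading a protein sequence from a gene, as described above, can be illustrated with a short Python sketch. It rests on several assumptions that are not in the text: the input is the coding strand of the DNA (so transcription is modelled simply as replacing T with U), the reading frame starts at the first base, and `CODON_TABLE` is a deliberately partial excerpt of the standard genetic code chosen for this toy example. The function names are hypothetical.

```python
# Partial excerpt of the standard genetic code (assumption: only the codons
# needed for the toy example are listed; "*" marks a stop codon).
CODON_TABLE = {
    "AUG": "M",  # methionine (start)
    "UUU": "F",  # phenylalanine
    "UUC": "F",  # phenylalanine
    "GGC": "G",  # glycine
    "GCU": "A",  # alanine
    "UAA": "*",
    "UAG": "*",
    "UGA": "*",
}


def transcribe(coding_dna: str) -> str:
    """Model transcription of the coding strand as a T -> U substitution."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read codons from the first base until a stop codon or the end of the mRNA.

    Residues are emitted in N-terminus to C-terminus order.
    Raises KeyError for codons missing from the partial table.
    """
    residues = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        residues.append(amino_acid)
    return "".join(residues)


if __name__ == "__main__":
    gene = "ATGTTTGGCGCTTAA"  # toy sequence, not a real gene
    print(translate(transcribe(gene)))
```

As the text notes, post-translational modifications cannot be recovered this way; the sketch yields only the encoded residue sequence.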
Secondary structure
Secondary structure refers to highly regular local sub-structures on the actual polypeptide backbone chain. Two main types of secondary structure, the α-helix and the β-strand or β-sheets, were suggested in 1951 by Linus Pauling et al. These secondary structures are defined by patterns of hydrogen bonds between the main-chain peptide groups. They have a regular geometry, being constrained to specific values of the dihedral angles ψ and φ on the Ramachandran plot. Both the α-helix and the β-sheet represent a way of saturating all the hydrogen bond donors and acceptors in the peptide backbone. Some parts of the protein are ordered but do not form any regular structures. They should not be confused with random coil, an unfolded polypeptide chain lacking any fixed three-dimensional structure. Several sequential secondary structures may form a "supersecondary unit".
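The idea that each secondary structure type occupies its own region of the Ramachandran plot can be sketched as a crude box classifier over the dihedral angles φ and ψ (in degrees). The box boundaries below are illustrative assumptions centred on commonly quoted typical values (roughly φ ≈ −60°, ψ ≈ −45° for a right-handed α-helix and φ ≈ −120°, ψ ≈ +130° for a β-strand); they are not a standard definition, and real assignment methods rely on hydrogen-bond patterns rather than angles alone.

```python
def classify_backbone(phi: float, psi: float) -> str:
    """Assign a residue to a rough Ramachandran region from its (phi, psi) pair.

    The rectangular regions are assumptions made for illustration only.
    """
    if -100 <= phi <= -30 and -80 <= psi <= -10:
        return "alpha-helix-like"
    if -180 <= phi <= -45 and 90 <= psi <= 180:
        return "beta-sheet-like"
    return "other"


if __name__ == "__main__":
    for angles in [(-60, -45), (-120, 130), (60, 60)]:
        print(angles, classify_backbone(*angles))
```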
Tertiary structure
Tertiary structure refers to the three-dimensional structure created by a single protein molecule (a single polypeptide chain). It may include one or several domains. The α-helices and β-pleated sheets are folded into a compact globular structure. The folding is driven by the ''non-specific'' hydrophobic interactions, the burial of hydrophobic residues from water, but the structure is stable only when the parts of a protein domain are locked into place by ''specific'' tertiary interactions, such as salt bridges, hydrogen bonds, and the tight packing of side chains and disulfide bonds. The disulfide bonds are extremely rare in cytosolic proteins, since the cytosol (intracellular fluid) is generally a reducing environment.
Quaternary structure
Quaternary structure is the three-dimensional structure consisting of the aggregation of two or more individual polypeptide chains (subunits) that operate as a single functional unit (multimer). The resulting multimer is stabilized by the same non-covalent interactions and disulfide bonds as in tertiary structure. There are many possible quaternary structure organisations. Complexes of two or more polypeptides (i.e. multiple subunits) are called multimers. Specifically it would be called a dimer if it contains two subunits, a trimer if it contains three subunits, a tetramer if it contains four subunits, and a pentamer if it contains five subunits. The subunits are frequently related to one another by symmetry operations, such as a 2-fold axis in a dimer. Multimers made up of identical subunits are referred to with a prefix of "homo-" and those made up of different subunits are referred to with a prefix of "hetero-", for example, a heterotetramer, such as the two alpha and two beta chains of hemoglobin.
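The naming rules above are mechanical enough to express as a small Python sketch. The function name and the representation of a complex as a list of subunit labels are assumptions made for illustration; only the four size names given in the text are covered.

```python
# Size names as listed in the text (two to five subunits).
SIZE_NAMES = {2: "dimer", 3: "trimer", 4: "tetramer", 5: "pentamer"}


def multimer_name(subunits: list[str]) -> str:
    """Name a complex from its subunit labels, e.g. ["alpha", "beta"].

    Identical labels give the prefix "homo", mixed labels give "hetero".
    """
    size = len(subunits)
    if size not in SIZE_NAMES:
        raise ValueError(f"no name defined here for {size} subunits")
    prefix = "homo" if len(set(subunits)) == 1 else "hetero"
    return prefix + SIZE_NAMES[size]


if __name__ == "__main__":
    # Hemoglobin: two alpha and two beta chains.
    print(multimer_name(["alpha", "alpha", "beta", "beta"]))
```

For the hemoglobin example from the text, the function returns "heterotetramer".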
Domains, motifs, and folds in protein structure

Proteins are frequently described as consisting of several structural units. These units include domains, motifs, and folds. Despite the fact that there are about 100,000 different proteins expressed in eukaryotic systems, there are many fewer different domains, structural motifs and folds.
Structural domain
A structural domain is an element of the protein's overall structure that is self-stabilizing and often folds independently of the rest of the protein chain. Many domains are not unique to the protein products of one gene or one gene family but instead appear in a variety of proteins. Domains often are named and singled out because they figure prominently in the biological function of the protein they belong to; for example, the "calcium-binding domain of calmodulin". Because they are independently stable, domains can be "swapped" by genetic engineering between one protein and another to make chimera proteins. A conservative combination of several domains that occur in different proteins, such as protein tyrosine phosphatase domain and C2 domain pair, was called "a superdomain" that may evolve as a single unit.
Structural and sequence motifs
The structural and sequence motifs refer to short segments of protein three-dimensional structure or amino acid sequence that were found in a large number of different proteins.
Supersecondary structure
The supersecondary structure refers to a specific combination of secondary structure elements, such as β-α-β units or a helix-turn-helix motif. Some of them may be also referred to as structural motifs.
Protein fold
A protein fold refers to the general protein architecture, like a helix bundle, β-barrel, Rossmann fold or different "folds" provided in the Structural Classification of Proteins database. A related concept is protein topology.
Protein dynamics and conformational ensembles
Proteins are not static objects, but rather populate ensembles of conformational states. Transitions between these states typically occur on nanoscales, and have been linked to functionally relevant phenomena such as allosteric signaling and enzyme catalysis.
Protein dynamics and conformational changes allow proteins to function as nanoscale biological machines within cells, often in the form of multi-protein complexes. Examples include motor proteins, such as myosin, which is responsible for muscle contraction, kinesin, which moves cargo inside cells away from the nucleus along microtubules, and dynein, which moves cargo inside cells towards the nucleus and produces the axonemal beating of motile cilia and flagella. "[I]n effect, the [m]otile cilium is a nanomachine composed of perhaps over 600 proteins in molecular complexes, many of which also function independently as nanomachines... Flexible linkers allow the mobile protein domains connected by them to recruit their binding partners and induce long-range allostery via protein domain dynamics."

Proteins are often thought of as relatively stable tertiary structures that experience conformational changes after being affected by interactions with other proteins or as a part of enzymatic activity. However, proteins may have varying degrees of stability, and some of the less stable variants are intrinsically disordered proteins. These proteins exist and function in a relatively 'disordered' state lacking a stable tertiary structure. As a result, they are difficult to describe by a single fixed tertiary structure. Conformational ensembles have been devised as a way to provide a more accurate and 'dynamic' representation of the conformational state of intrinsically disordered proteins.
Protein ensemble files are a representation of a protein that can be considered to have a flexible structure. Creating these files requires determining which of the various theoretically possible protein conformations actually exist. One approach is to apply computational algorithms to the protein data in order to determine the most likely set of conformations for an ensemble file. There are multiple methods for preparing data for the Protein Ensemble Database that fall into two general methodologies: pool and molecular dynamics (MD) approaches (diagrammed in the figure). The pool-based approach uses the protein's amino acid sequence to create a massive pool of random conformations. This pool is then subjected to further computational processing that creates a set of theoretical parameters for each conformation based on its structure. Conformational subsets from this pool whose average theoretical parameters closely match known experimental data for the protein are selected. The alternative molecular dynamics approach takes multiple random conformations at a time and subjects all of them to experimental data. Here the experimental data serve as restraints placed on the conformations (e.g. known distances between atoms). Only conformations that remain within the limits set by the experimental data are accepted. This approach often applies large amounts of experimental data to the conformations, which is a very computationally demanding task.
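The pool-based selection step can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: each conformation is reduced to a single made-up theoretical parameter, the pool is drawn from a normal distribution rather than built from a sequence, and the hypothetical function `select_ensemble` uses a simple random-swap search rather than the algorithm of any particular ensemble-generation tool.

```python
import random


def select_ensemble(pool_params, target, size, steps=5000, seed=0):
    """Pick `size` pool members whose mean parameter approaches `target`.

    Random-swap search: propose replacing one selected conformation with an
    unselected one and keep the swap only if the ensemble mean moves closer
    to the experimental target.
    """
    rng = random.Random(seed)
    chosen = rng.sample(range(len(pool_params)), size)
    in_ensemble = set(chosen)
    total = sum(pool_params[i] for i in chosen)
    best_error = abs(total / size - target)
    for _ in range(steps):
        slot = rng.randrange(size)
        candidate = rng.randrange(len(pool_params))
        if candidate in in_ensemble:
            continue  # already selected, try another proposal
        new_total = total - pool_params[chosen[slot]] + pool_params[candidate]
        new_error = abs(new_total / size - target)
        if new_error < best_error:
            in_ensemble.remove(chosen[slot])
            in_ensemble.add(candidate)
            chosen[slot] = candidate
            total, best_error = new_total, new_error
    return sorted(chosen), total / size


# Toy pool: one made-up theoretical parameter per random conformation.
pool_rng = random.Random(1)
pool = [pool_rng.gauss(30.0, 5.0) for _ in range(10_000)]

experimental_value = 27.5  # hypothetical measured average
ensemble, ensemble_mean = select_ensemble(pool, experimental_value, size=50)
print(f"{len(ensemble)} conformations, mean parameter {ensemble_mean:.3f}")
```

In practice several experimental observables would be matched at once, and the restraint-driven MD approach would instead reject conformations during sampling; the sketch only shows the idea of choosing a subset whose averaged parameters agree with experiment.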
Conformational ensembles have been generated for a number of highly dynamic and partially unfolded proteins, such as Sic1/Cdc4, p15 PAF, MKK7, Beta-synuclein and P27.
Protein folding
As they are translated, polypeptides exit the ribosome mostly as random coils and fold into their native state. The final structure of the protein chain is generally assumed to be determined by its amino acid sequence (Anfinsen's dogma).
Protein stability
The thermodynamic stability of a protein is the Gibbs free energy difference between its folded and unfolded states. This free energy difference is very sensitive to temperature, so a change in temperature may result in unfolding, or denaturation. Protein denaturation may result in loss of function and loss of the native state. The free energy of stabilization of soluble globular proteins typically does not exceed 50 kJ/mol. Given the large number of hydrogen bonds that stabilize secondary structures, and the stabilization of the inner core through hydrophobic interactions, the free energy of stabilization emerges as a small difference between large numbers.
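The temperature sensitivity can be made concrete with a minimal two-state sketch. It assumes a simple folded/unfolded equilibrium with a temperature-independent unfolding enthalpy and entropy (heat capacity changes are ignored), and the values of `DELTA_H` and `DELTA_S` are hypothetical numbers chosen for illustration, not measurements for any real protein.

```python
import math

R = 8.314462618e-3  # molar gas constant, kJ/(mol*K)

# Hypothetical unfolding enthalpy and entropy, chosen only for illustration.
DELTA_H = 400.0  # kJ/mol
DELTA_S = 1.2    # kJ/(mol*K)


def stabilization_free_energy(temp_k, delta_h=DELTA_H, delta_s=DELTA_S):
    """Free energy of unfolding, G(unfolded) - G(folded), in kJ/mol."""
    return delta_h - temp_k * delta_s


def fraction_folded(temp_k, delta_h=DELTA_H, delta_s=DELTA_S):
    """Equilibrium folded fraction in a two-state folded/unfolded model."""
    delta_g = stabilization_free_energy(temp_k, delta_h, delta_s)
    return 1.0 / (1.0 + math.exp(-delta_g / (R * temp_k)))


for temp_k in (298.15, 323.15, 333.15, 343.15):
    print(f"{temp_k:7.2f} K  dG = {stabilization_free_energy(temp_k):7.2f} kJ/mol  "
          f"folded = {fraction_folded(temp_k):.4f}")
```

With these assumed values, an enthalpy term of 400 kJ/mol and an entropy term of about 358 kJ/mol at 298.15 K leave a net stabilization of roughly 42 kJ/mol, which illustrates the small difference between large numbers; the folded fraction drops sharply once the temperature passes the point where the two terms cancel.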
Protein structure determination

Around 90% of the protein structures available in the Protein Data Bank have been determined by X-ray crystallography. This method measures the three-dimensional (3-D) density distribution of electrons in the protein, in the crystallized state, which allows the 3-D coordinates of all the atoms to be inferred to a certain resolution. Roughly 9% of the known protein structures have been obtained by nuclear magnetic resonance (NMR) techniques. For larger protein complexes, cryo-electron microscopy can determine protein structures. The resolution is typically lower than that of X-ray crystallography or NMR, but the maximum resolution is steadily increasing. This technique is particularly valuable for very large protein complexes such as virus coat proteins and amyloid fibers.
General secondary structure composition can be determined via circular dichroism. Vibrational spectroscopy can also be used to characterize the conformation of peptides, polypeptides, and proteins.
Two-dimensional infrared spectroscopy has become a valuable method to investigate the structures of flexible peptides and proteins that cannot be studied with other methods. A more qualitative picture of protein structure is often obtained by proteolysis, which is also useful to screen for more crystallizable protein samples. Novel implementations of this approach, including fast parallel proteolysis (FASTpp), can probe the structured fraction and its stability without the need for purification.
Once a protein's structure has been experimentally determined, further detailed studies can be done computationally, using molecular dynamics simulations of that structure.
Protein structure databases
A protein structure database is a database that is modeled around the various experimentally determined protein structures. The aim of most protein structure databases is to organize and annotate the protein structures, providing the biological community access to the experimental data in a useful way. Data included in protein structure databases often includes 3D coordinates as well as experimental information, such as unit cell dimensions and angles for structures determined by X-ray crystallography. Most entries (in this case either proteins or specific structure determinations of a protein) also contain sequence information, and some databases even provide means for performing sequence-based queries. Nevertheless, the primary attribute of a structure database is structural information, whereas sequence databases focus on sequence information and contain no structural information for the majority of entries. Protein structure databases are critical for many efforts in computational biology such as structure-based drug design, both in developing the computational methods used and in providing a large experimental dataset used by some methods to provide insights about the function of a protein.
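As a rough illustration of the kind of data such an entry holds, the sketch below reads unit cell parameters and atomic coordinates from a few records written in the legacy fixed-column PDB text format. The column positions are taken from that format as commonly documented and should be checked against the format specification; the function names `parse_unit_cell` and `parse_atom` are hypothetical, and the sample records contain made-up values rather than data from a real entry.

```python
def parse_unit_cell(line):
    """Read cell lengths (angstroms) and angles (degrees) from a CRYST1 record."""
    lengths = tuple(float(line[s:e]) for s, e in ((6, 15), (15, 24), (24, 33)))
    angles = tuple(float(line[s:e]) for s, e in ((33, 40), (40, 47), (47, 54)))
    return lengths, angles


def parse_atom(line):
    """Read the atom name, residue, chain and x/y/z from an ATOM/HETATM record."""
    return {
        "name": line[12:16].strip(),
        "residue": line[17:20].strip(),
        "chain": line[21],
        "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
    }


# Made-up sample records laid out in fixed columns (illustration only).
SAMPLE_RECORDS = [
    "CRYST1   52.000   58.600   61.900  90.00  90.00  90.00 P 21 21 21    8",
    "ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67           N",
    "ATOM      2  CA  MET A   1      26.266  25.413   2.842  1.00 10.38           C",
]

unit_cell = None
atoms = []
for record in SAMPLE_RECORDS:
    if record.startswith("CRYST1"):
        unit_cell = parse_unit_cell(record)
    elif record.startswith(("ATOM  ", "HETATM")):
        atoms.append(parse_atom(record))

print("unit cell lengths and angles:", unit_cell)
for atom in atoms:
    print(atom["chain"], atom["residue"], atom["name"], atom["xyz"])
```

Fields are sliced by column rather than split on whitespace because adjacent fields in fixed-column records can run together. Real entries contain many more record types, and current archives also distribute structures in other formats, so a maintained parsing library is usually the better choice for actual work.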
Structural classifications of proteins
Protein structures can be grouped based on their structural similarity, topological class or a common evolutionary origin. The Structural Classification of Proteins database and CATH database provide two different structural classifications of proteins. When the structural similarity is large the two proteins have possibly diverged from a common ancestor, and shared structure between proteins is considered evidence of homology. Structure similarity can then be used to group proteins together into protein superfamilies.
If shared structure is significant but the fraction shared is small, the fragment shared may be the consequence of a more dramatic evolutionary event such as horizontal gene transfer, and joining proteins sharing these fragments into protein superfamilies is no longer justified.
Topology of a protein can be used to classify proteins as well. Knot theory and circuit topology are two topology frameworks developed for classification of protein folds based on chain crossing and intrachain contacts respectively.
Computational prediction of protein structure
The generation of a protein sequence is much easier than the determination of a protein structure. However, the structure of a protein gives much more insight into the function of the protein than its sequence. Therefore, a number of methods for the computational prediction of protein structure from its sequence have been developed.
''Ab initio'' prediction methods use just the sequence of the protein. Threading and homology modeling methods can build a 3-D model for a protein of unknown structure from experimental structures of evolutionarily related proteins, called a protein family.
See also
* Biomolecular structure
* Gene structure
* Nucleic acid structure
* Ribbon diagram: 3D schematic representation of proteins
Further reading
* 50 Years of Protein Structure Determination Timeline - HTML Version - National Institute of General Medical Sciences at NIH