DNA-binding proteins are proteins that have DNA-binding domains and thus have a specific or general affinity for single- or double-stranded DNA. Sequence-specific DNA-binding proteins generally interact with the major groove of B-DNA, because it exposes more functional groups that identify a base pair.
Examples
DNA-binding proteins include transcription factors, which modulate the process of transcription; various polymerases; nucleases, which cleave DNA molecules; and histones, which are involved in chromosome packaging and transcription in the cell nucleus. DNA-binding proteins can incorporate domains such as the zinc finger, the helix-turn-helix, and the leucine zipper (among many others) that facilitate binding to nucleic acid. There are also more unusual examples, such as transcription activator-like effectors.
Non-specific DNA-protein interactions
Structural proteins that bind DNA are well-understood examples of non-specific DNA-protein interactions. Within chromosomes, DNA is held in complexes with structural proteins. These proteins organize the DNA into a compact structure called chromatin. In eukaryotes, this structure involves DNA binding to a complex of small basic proteins called histones. In prokaryotes, multiple types of proteins are involved. The histones form a disk-shaped complex called a nucleosome, which contains almost two turns of double-stranded DNA wrapped around its surface. These non-specific interactions are formed through basic residues in the histones making ionic bonds to the acidic sugar-phosphate backbone of the DNA, and are therefore largely independent of the base sequence.
Chemical modifications of these basic amino acid residues include methylation, phosphorylation and acetylation. These chemical changes alter the strength of the interaction between the DNA and the histones, making the DNA more or less accessible to transcription factors and changing the rate of transcription. Other non-specific DNA-binding proteins in chromatin include the high-mobility group (HMG) proteins, which bind to bent or distorted DNA. Biophysical studies show that these architectural HMG proteins bind, bend and loop DNA to perform their biological functions. These proteins are important in bending arrays of nucleosomes and arranging them into the larger structures that form chromosomes. FK506 binding protein 25 (FKBP25) has also been shown to bind DNA non-specifically, which helps in DNA repair.
Proteins that specifically bind single-stranded DNA
A distinct group of DNA-binding proteins bind specifically to single-stranded DNA. In humans, replication protein A is the best-understood member of this family and is used in processes where the double helix is separated, including DNA replication, recombination and DNA repair. These binding proteins seem to stabilize single-stranded DNA and protect it from forming stem-loops or being degraded by nucleases.
Binding to specific DNA sequences

In contrast, other proteins have evolved to bind to specific DNA sequences. The most intensively studied of these are the various transcription factors, which are proteins that regulate transcription. Each transcription factor binds to one specific set of DNA sequences and activates or inhibits the transcription of genes that have these sequences near their promoters. The transcription factors do this in two ways. Firstly, they can bind the RNA polymerase responsible for transcription, either directly or through other mediator proteins; this locates the polymerase at the promoter and allows it to begin transcription. Alternatively, transcription factors can bind enzymes that modify the histones at the promoter. This alters the accessibility of the DNA template to the polymerase.
These DNA targets can occur throughout an organism's genome, so changes in the activity of one type of transcription factor can affect thousands of genes. These proteins are therefore often the targets of the signal transduction processes that control responses to environmental changes or cellular differentiation and development. The specificity of these transcription factors' interactions with DNA comes from the proteins making multiple contacts with the edges of the DNA bases, allowing them to ''read'' the DNA sequence. Most of these base interactions are made in the major groove, where the bases are most accessible. Mathematical descriptions of protein-DNA binding that take into account sequence specificity, as well as competitive and cooperative binding of proteins of different types, are usually performed with the help of lattice models. Computational methods to identify DNA-binding sequence specificity have been proposed to make good use of the abundant sequence data of the post-genomic era. In addition, progress has been made on structure-based prediction of binding specificity across protein families using deep learning.
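As an illustration of how sequence specificity can be described computationally, the following minimal sketch scores DNA windows against a position weight matrix (PWM). The PWM is a representation that the text above does not name, so it is used here as an assumption. The aligned sites, the uniform background frequency of 0.25 and the pseudocount are hypothetical values chosen only for illustration and do not describe any real transcription factor.

```python
import math

# Hypothetical aligned binding sites for an imaginary factor (illustrative data only).
SITES = ["TGACTCA", "TGAGTCA", "TGACTCA", "TGACTAA", "TGAGTCA"]
BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Return one {base: log2-odds score} dict per motif position."""
    length = len(sites[0])
    total = len(sites) + pseudocount * len(BASES)
    pwm = []
    for i in range(length):
        column = [site[i] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in BASES
        })
    return pwm


def score(pwm, window):
    """Sum the per-position log-odds scores for a window of motif length."""
    return sum(column[base] for column, base in zip(pwm, window))


def best_site(pwm, sequence):
    """Return (score, start index) of the highest-scoring window in sequence."""
    width = len(pwm)
    hits = [
        (score(pwm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    ]
    return max(hits)


pwm = build_pwm(SITES)
best_score, position = best_site(pwm, "CCGATGACTCAGGTT")
print(position, round(best_score, 2))
```

Higher scores mark windows that resemble the example sites more closely. The sketch only accepts the four unambiguous bases; practical analyses typically rely on curated motif collections and on score thresholds calibrated against background sequence.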
Protein–DNA interactions
Protein–DNA interactions occur when a protein binds a molecule of DNA, often to regulate the biological function of DNA, usually the expression of a gene. Among the proteins that bind to DNA are transcription factors, which activate or repress gene expression by binding to DNA motifs, and histones, which form part of the structure of chromatin and bind DNA less specifically. Proteins that repair DNA, such as uracil-DNA glycosylase, also interact closely with it.
In general, proteins bind to DNA in the major groove; however, there are exceptions.
Protein–DNA interactions are mainly of two types: specific and non-specific. Recent single-molecule experiments showed that DNA-binding proteins undergo rapid rebinding in order to bind in the correct orientation for recognizing the target site.
Design
Designing DNA-binding proteins that have a specified DNA-binding site has been an important goal for biotechnology.
Zinc finger proteins have been designed to bind to specific DNA sequences, and this is the basis of zinc finger nucleases. More recently, transcription activator-like effector nucleases (TALENs) have been created; these are based on natural proteins secreted by ''Xanthomonas'' bacteria via their type III secretion system when they infect various plant species.
Detection methods
Many ''in vitro'' and ''in vivo'' techniques are useful for detecting DNA-protein interactions. The following are some of the methods currently in use:
* Electrophoretic mobility shift assay (EMSA) is a widespread qualitative technique to study protein–DNA interactions of known DNA-binding proteins.
* DNA-Protein-Interaction Enzyme-Linked ImmunoSorbent Assay (DPI-ELISA) allows the qualitative and quantitative analysis of DNA-binding preferences of known proteins ''in vitro''. It also allows the analysis of protein complexes that bind to DNA (DPI-Recruitment-ELISA), and its standard ELISA plate format makes it suited for automated screening of several nucleotide probes.
* DNase footprinting assay can be used to identify the specific sites of binding of a protein to DNA at base-pair resolution.
* Chromatin immunoprecipitation is used to identify the ''in vivo'' DNA target regions of a known transcription factor. When combined with high-throughput sequencing, this technique is known as ChIP-Seq, and when combined with microarrays it is known as ChIP-chip.
* Yeast one-hybrid system (Y1H) is used to identify which protein binds to a particular DNA fragment.
* Bacterial one-hybrid system (B1H) is used to identify which protein binds to a particular DNA fragment.
* Structure determination using X-ray crystallography has been used to give a highly detailed atomic view of protein–DNA interactions.
Besides these methods, other techniques such as SELEX, PBM (protein-binding microarrays), DNA microarray screens, DamID, FAIRE or, more recently, DAP-seq are used in the laboratory to investigate DNA-protein interactions ''in vivo'' and ''in vitro''.
Manipulating the interactions
Protein–DNA interactions can be modulated using stimuli such as the ionic strength of the buffer, macromolecular crowding, temperature, pH and electric fields. This can lead to reversible dissociation or association of the protein–DNA complex.
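The reversible association described above can be sketched with the simplest equilibrium model, in which one protein binds one DNA site with dissociation constant Kd and the fraction of occupied sites is [P] / (Kd + [P]). The example below assumes that the free protein concentration is approximately equal to the total protein concentration (DNA present in trace amounts). The function name and the two Kd values standing in for low and high ionic strength are hypothetical, and the assumption that higher ionic strength weakens binding is a common simplification, not a measured result.

```python
def fraction_bound(protein_conc, kd):
    """Equilibrium fraction of DNA sites occupied for simple 1:1 binding.

    Both arguments must use the same concentration unit (for example nM).
    """
    if protein_conc < 0 or kd <= 0:
        raise ValueError("protein_conc must be >= 0 and kd must be > 0")
    return protein_conc / (kd + protein_conc)


# Hypothetical dissociation constants in nM: higher ionic strength is assumed
# here to weaken binding, which corresponds to a larger Kd.
KD_LOW_SALT = 1.0
KD_HIGH_SALT = 100.0
PROTEIN = 10.0  # nM, held constant in this illustration

occupancy_low_salt = fraction_bound(PROTEIN, KD_LOW_SALT)    # 10 / 11
occupancy_high_salt = fraction_bound(PROTEIN, KD_HIGH_SALT)  # 10 / 110

print(f"low salt:  {occupancy_low_salt:.2f}")
print(f"high salt: {occupancy_high_salt:.2f}")
```

In this model a change of stimulus shifts the equilibrium without destroying either partner, which is why returning to the original conditions restores the original occupancy.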
See also
* bZIP domain
* ChIP-exo
* Comparison of nucleic acid simulation software
* DNA-binding domain
* Helix-loop-helix
* Helix-turn-helix
* HMG-box
* Leucine zipper
* Lexitropsin (a semi-synthetic DNA-binding ligand)
* Deoxyribonucleoprotein
* Protein–DNA interaction site prediction software
* RNA-binding protein
* Single-strand binding protein
* Zinc finger
External links
* Protein-DNA binding: data, tools & models (annotated list, constantly updated), a tool for modeling DNA-ligand interactions
* DBD database of predicted transcription factors, which uses a curated set of DNA-binding domains to predict transcription factors in all completely sequenced genomes