In genetics, a promoter is a sequence of DNA to which proteins bind to initiate transcription of a single RNA transcript from the DNA downstream of the promoter. The RNA transcript may encode a protein (mRNA), or can have a function in and of itself, such as tRNA or rRNA. Promoters are located near the transcription start sites of genes, upstream on the DNA (towards the 5' region of the sense strand). Promoters can be about 100–1000 base pairs long, the sequence of which is highly dependent on the gene and product of transcription, type or class of RNA polymerase recruited to the site, and species of organism.
Promoters control gene expression in both bacteria and eukaryotes. For transcription to occur, RNA polymerase must attach to the DNA near a gene, and promoter DNA sequences provide the binding site for the enzyme. In bacteria, the consensus sequence at -10 is TATAAT and at -35 is TTGACA; these sequences are conserved on average, but are not found intact in most promoters.
Artificial promoters with fully conserved -10 and -35 elements transcribe at lower frequencies than those with a few mismatches. "Closely spaced promoters" have been observed in the DNA of all life forms, in divergent, tandem, and convergent orientations, and two closely spaced promoters will most likely interfere with each other. Gene promoters can have regulatory elements (enhancers) several kilobases away from the transcriptional start site.
In eukaryotes, the transcriptional complex can bend the DNA back on itself, allowing regulatory sequences to be placed far from the site of transcription. The distal promoter lies further upstream of the gene and may contain additional regulatory elements, often with a weaker influence than the proximal promoter. RNA polymerase II (RNAP II) bound to the promoter at the transcription start site can start mRNA synthesis.
CpG islands, a TATA box, and TFIIB recognition elements can be found in promoter DNA. Weingarten-Gabbay et al. found that the presence or absence of elements other than CpG islands has relatively small effects on gene expression. Figure 1 shows an enhancer looping around to come into proximity with a gene's promoter. A connector protein dimer (e.g. CTCF or YY1) stabilizes the loop, with one member anchored on the enhancer and the other on the promoter. In the Gene Ontology database, bidirectionally paired genes shared at least one functional category with their partners 47% of the time.
Hypermethylation of a bidirectional promoter region downregulates both genes of the pair, while demethylation upregulates them. Research has linked non-coding RNAs to mRNA promoter regions. Subgenomic promoters range from 24 to 100 nucleotides in length (Beet necrotic yellow vein virus). Gene expression depends on the binding of proteins to the promoter, and unwanted changes in gene expression can increase a cell's cancer risk.
MicroRNA promoters often contain CpG islands. DNA methylation forms 5-methylcytosine at the 5 position of the pyrimidine ring of cytosine residues in CpG sites. Some cancer genes are silenced by mutation, but most are silenced by DNA methylation. Some promoters are constitutive, while others are regulated. Selection may favor less energetic transcriptional binding.
Variations in promoters or transcription factors cause some diseases. Describing a promoter by its canonical sequence can lead to misunderstandings.
Overview
For transcription to take place, the enzyme that synthesizes RNA, known as RNA polymerase, must attach to the DNA near a gene. Promoters contain specific DNA sequences such as response elements that provide a secure initial binding site for RNA polymerase and for proteins called transcription factors that recruit RNA polymerase. These transcription factors have specific activator or repressor sequences of corresponding nucleotides that attach to specific promoters and regulate gene expression.
;In bacteria: The promoter is recognized by RNA polymerase and an associated sigma factor, which in turn are often brought to the promoter DNA by an activator protein's binding to its own DNA binding site nearby.
;In eukaryotes: The process is more complicated, and at least seven different factors are necessary for the binding of an RNA polymerase II to the promoter.
Promoters represent critical elements that can work in concert with other regulatory regions (enhancers, silencers, boundary elements/insulators) to direct the level of transcription of a given gene.
A promoter is induced in response to changes in abundance or conformation of regulatory proteins in a cell, which enable activating transcription factors to recruit RNA polymerase.
Identification of relative location
As promoters are typically immediately adjacent to the gene in question, positions in the promoter are designated relative to the transcriptional start site, where transcription of DNA begins for a particular gene (i.e., positions upstream are negative numbers counting back from -1; for example, -100 is a position 100 base pairs upstream).
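The numbering scheme can be sketched in a few lines of Python. This is only an illustration: the function name `relative_position` and its arguments are hypothetical, and it assumes the common convention that the transcription start site itself is numbered +1, so that there is no position 0.

```python
def relative_position(genomic_pos: int, tss: int, strand: str = "+") -> int:
    """Return the position of a base relative to the transcription start site.

    Assumed convention: the TSS itself is +1 and the base immediately
    upstream of it is -1, so position 0 does not exist.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    # Signed offset measured in the direction of transcription.
    offset = genomic_pos - tss if strand == "+" else tss - genomic_pos
    # The TSS and everything downstream are shifted by one to skip zero.
    return offset + 1 if offset >= 0 else offset


# A base 100 bp upstream of a TSS at coordinate 1000 on the plus strand.
print(relative_position(900, 1000))
```

On the minus strand, "upstream" corresponds to higher genomic coordinates, which is why the offset is reversed for that strand.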
Relative location in the cell nucleus
In the cell nucleus, it seems that promoters are distributed preferentially at the edge of the chromosomal territories, likely for the co-expression of genes on different chromosomes.
Furthermore, in humans, promoters show certain structural features characteristic of each chromosome.
Elements
Bacterial
In bacteria, the promoter contains two short sequence elements approximately 10 (Pribnow Box) and 35 nucleotides ''upstream'' from the transcription start site.
* The sequence at -10 (the -10 element) has the consensus sequence TATAAT.
* The sequence at -35 (the -35 element) has the consensus sequence TTGACA.
* The above consensus sequences, while conserved on average, are not found intact in most promoters. On average, only 3 to 4 of the 6 base pairs in each consensus sequence are found in any given promoter. Few natural promoters have been identified to date that possess intact consensus sequences at both the -10 and -35; artificial promoters with complete conservation of the -10 and -35 elements have been found to transcribe at lower frequencies than those with a few mismatches with the consensus.
* The optimal spacing between the -35 and -10 sequences is 17 bp.
* Some promoters contain one or more upstream promoter element (UP element) subsites (consensus sequence 5'-AAAAAARNR-3' when centered in the -42 region; consensus sequence 5'-AWWWWWTTTTT-3' when centered in the -52 region; W = A or T; R = A or G; N = any base).
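The elements listed above can be compared against a candidate sequence with a short script. The following is a minimal sketch, not an established promoter-prediction method: the function names are hypothetical, and the range of spacer lengths that is scanned (15 to 19 bp around the 17 bp optimum) is an assumption made only for illustration.

```python
import re

CONSENSUS_35 = "TTGACA"
CONSENSUS_10 = "TATAAT"

# Ambiguity codes used in the UP element consensus sequences.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "W": "[AT]", "R": "[AG]", "N": "[ACGT]"}


def count_matches(hexamer: str, consensus: str) -> int:
    """Count positions at which a hexamer agrees with a consensus."""
    if len(hexamer) != len(consensus):
        raise ValueError("hexamer and consensus must have equal length")
    return sum(a == b for a, b in zip(hexamer.upper(), consensus))


def iupac_to_regex(pattern: str) -> re.Pattern:
    """Translate a consensus written with ambiguity codes into a regex."""
    return re.compile("".join(IUPAC[c] for c in pattern))


def best_hit(seq: str, spacers=range(15, 20)):
    """Find the -35/-10 hexamer pair with the most consensus matches.

    The spacer range is an illustrative assumption; ties are broken in
    favour of spacers closest to 17 bp.
    """
    seq = seq.upper()
    best = None
    for i in range(len(seq) - 12 - min(spacers) + 1):
        for spacer in spacers:
            j = i + 6 + spacer          # start of the candidate -10 hexamer
            if j + 6 > len(seq):
                continue
            m35 = count_matches(seq[i:i + 6], CONSENSUS_35)
            m10 = count_matches(seq[j:j + 6], CONSENSUS_10)
            key = (m35 + m10, -abs(spacer - 17))
            if best is None or key > best[0]:
                best = (key, i, spacer, m35, m10)
    if best is None:
        return None
    _, i, spacer, m35, m10 = best
    return {"pos35": i, "spacer": spacer, "matches35": m35, "matches10": m10}


# An artificial sequence with both hexamers intact and a 17 bp spacer.
example = "GG" + CONSENSUS_35 + "A" * 17 + CONSENSUS_10 + "GG"
print(best_hit(example))
print(bool(iupac_to_regex("AAAAAARNR").fullmatch("AAAAAAGCA")))
```

Counting matches rather than requiring an exact hit reflects the point made above that most natural promoters carry only part of each consensus.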
The above promoter sequences are recognized only by RNA polymerase holoenzyme containing sigma-70. RNA polymerase holoenzymes containing other sigma factors recognize different core promoter sequences.

 <-- upstream                                                        downstream -->
 5'-XXXXXXXPPPPPPXXXXXXPPPPPPXXXXGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGXXXX-3'
           -35         -10       Gene to be transcribed
Probability of occurrence of each nucleotide

 for -10 sequence
  T    A    T    A    A    T
 77%  76%  60%  61%  56%  82%

 for -35 sequence
  T    T    G    A    C    A
 69%  79%  61%  56%  54%  54%
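The percentages in the table can be related to the earlier statement that only 3 to 4 of the 6 consensus base pairs are found in a typical promoter. The sketch below sums the per-position frequencies to get the expected number of matching positions. It also multiplies them to estimate how often a fully intact hexamer would occur, which additionally assumes that the positions vary independently; that assumption is made here for illustration and is not stated in the text.

```python
from math import prod

# Frequency of the consensus base at each position, from the table above.
FREQ_10 = [0.77, 0.76, 0.60, 0.61, 0.56, 0.82]   # T A T A A T
FREQ_35 = [0.69, 0.79, 0.61, 0.56, 0.54, 0.54]   # T T G A C A


def expected_matches(freqs):
    """Expected number of consensus positions present in a promoter."""
    return sum(freqs)


def intact_probability(freqs):
    """Probability of a fully intact hexamer, assuming independent positions."""
    return prod(freqs)


for name, freqs in (("-10", FREQ_10), ("-35", FREQ_35)):
    print(f"{name}: expected matches = {expected_matches(freqs):.2f}, "
          f"P(intact) = {intact_probability(freqs):.3f}")
```

The expected values come out at roughly 4.1 for the -10 element and 3.7 for the -35 element. Under the independence assumption, only a small minority of promoters would carry an intact hexamer at either site.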
Bidirectional (prokaryotic)
Promoters can be located very close to one another in the DNA. Such “closely spaced promoters” have been observed in the DNA of all life forms, from humans to prokaryotes, and are highly conserved. Therefore, they may provide some (presently unknown) advantages.
These pairs of promoters can be positioned in divergent, tandem, and convergent directions. They can also be regulated by transcription factors and differ in various features, such as the nucleotide distance between them, the two promoter strengths, etc.
The most important aspect of two closely spaced promoters is that they will, most likely, interfere with each other. Several studies have explored this using both analytical and stochastic models.
There are also studies that measured gene expression in synthetic genes or from one to a few genes controlled by bidirectional promoters.
More recently, one study measured the expression of most genes controlled by tandem promoters in ''E. coli''.
That study measured and then modeled two main forms of interference. One is when an RNAP is on the downstream promoter, blocking the movement of RNAPs elongating from the upstream promoter. The other is when the two promoters are so close that when an RNAP sits on one of the promoters, it blocks any other RNAP from reaching the other promoter. These events are possible because the RNAP occupies several nucleotides when bound to the DNA, including in transcription start sites.
Similar events occur when the promoters are in divergent and convergent formations. The possible events also depend on the distance between them.
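The three arrangements and the distance dependence of the second form of interference can be illustrated with a small geometric sketch. It is not the model used in the studies mentioned above. The `Promoter` class and both functions are hypothetical, and the footprint sizes (55 bp upstream and 20 bp downstream of the start site) are placeholder values chosen only so the example runs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Promoter:
    tss: int      # genomic coordinate of the transcription start site
    strand: str   # '+' or '-'


def orientation(a: Promoter, b: Promoter) -> str:
    """Classify a promoter pair as tandem, divergent, or convergent."""
    left, right = sorted((a, b), key=lambda p: p.tss)
    if left.strand == right.strand:
        return "tandem"
    # Left promoter on '+' transcribes toward the right one: convergent.
    return "convergent" if left.strand == "+" else "divergent"


def footprint(p: Promoter, upstream: int, downstream: int):
    """Interval of DNA covered by an RNAP bound at the promoter."""
    if p.strand == "+":
        return p.tss - upstream, p.tss + downstream
    return p.tss - downstream, p.tss + upstream


def footprints_overlap(a: Promoter, b: Promoter,
                       upstream: int = 55, downstream: int = 20) -> bool:
    """True if an RNAP bound at one promoter would occlude the other.

    The default footprint sizes are illustrative placeholders.
    """
    a_lo, a_hi = footprint(a, upstream, downstream)
    b_lo, b_hi = footprint(b, upstream, downstream)
    return a_lo <= b_hi and b_lo <= a_hi


p1, p2 = Promoter(100, "+"), Promoter(150, "+")
print(orientation(p1, p2), footprints_overlap(p1, p2))
```

With these placeholder values, two tandem promoters 50 bp apart occlude each other, while the same pair 300 bp apart would not.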
Eukaryotic
Eukaryotic promoters are diverse and can be difficult to characterize; however, recent studies show that they can be divided into more than 10 classes.
Gene promoters are typically located upstream of the gene and can have regulatory elements several kilobases away from the transcriptional start site (enhancers). In eukaryotes, the transcriptional complex can cause the DNA to bend back on itself, which allows for placement of regulatory sequences far from the actual site of transcription. Eukaryotic RNA-polymerase-II-dependent promoters can contain a TATA box (consensus sequence TATAAA), which is recognized by the general transcription factor TATA-binding protein (TBP); and a B recognition element (BRE), which is recognized by the general transcription factor TFIIB.
The TATA element and BRE typically are located close to the transcriptional start site (typically within 30 to 40 base pairs).
Eukaryotic promoter regulatory sequences typically bind proteins called transcription factors that are involved in the formation of the transcriptional complex. An example is the E-box (sequence CACGTG), which binds transcription factors in the basic helix-loop-helix (bHLH) family (e.g. BMAL1-Clock, cMyc). Some promoters that are targeted by multiple transcription factors might achieve a hyperactive state, leading to increased transcriptional activity.
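Locating such a binding motif in a sequence is a simple pattern search. The sketch below is illustrative only: the function name `find_eboxes` is hypothetical, and it uses the general E-box consensus CANNTG (N = any base), of which the CACGTG sequence given above is the canonical form.

```python
import re

# General E-box consensus CANNTG; the lookahead also reports hits that
# overlap one another.
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")
CANONICAL = "CACGTG"


def find_eboxes(seq: str):
    """Return (position, motif, is_canonical) for every E-box in seq."""
    seq = seq.upper()
    return [(m.start(), m.group(1), m.group(1) == CANONICAL)
            for m in EBOX.finditer(seq)]


for hit in find_eboxes("TTCACGTGAACAGCTGTT"):
    print(hit)
```

Positions are zero-based offsets into the input string. A sequence match alone does not show that a transcription factor actually binds there.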
* Core promoter – the minimal portion of the promoter required to properly initiate transcription
** Includes the transcription start site (TSS) and elements directly upstream
** A binding site for RNA polymerase
*** RNA polymerase I: transcribes genes encoding 18S, 5.8S and 28S ribosomal RNAs
*** RNA polymerase II: transcribes genes encoding messenger RNA and certain small nuclear RNAs and microRNA
*** RNA polymerase III: transcribes genes encoding transfer RNA, 5S ribosomal RNAs and other small RNAs
** General transcription factor binding sites, e.g. TATA box, B recognition element.
** Many other elements/motifs may be present. There is no such thing as a set of "universal elements" found in every core promoter.
* Proximal promoter – the proximal sequence upstream of the gene that tends to contain primary regulatory elements
** Approximately 250 base pairs upstream of the start site
** Specific transcription factor binding sites
* Distal promoter – the distal sequence upstream of the gene that may contain additional regulatory elements, often with a weaker influence than the proximal promoter
** Anything further upstream (but not an enhancer or other regulatory region whose influence is positional/orientation independent)
** Specific transcription factor binding sites
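The three regions in the list above can be expressed as a simple classification by distance from the transcription start site. This is a rough sketch with hypothetical names. The 250 bp proximal limit comes from the list, whereas the 40 bp core limit is an assumption based on the statement that the TATA element and BRE typically lie within 30 to 40 base pairs of the start site. Real boundaries are not sharp.

```python
def promoter_region(rel_pos: int,
                    core_upstream: int = 40,
                    proximal_upstream: int = 250) -> str:
    """Classify a TSS-relative position (+1 = TSS, negative = upstream).

    The thresholds are illustrative; only upstream positions and the TSS
    itself are assigned to a promoter region here.
    """
    if rel_pos == 0:
        raise ValueError("position 0 does not exist in TSS-relative numbering")
    if rel_pos > 1:
        return "downstream of TSS"
    distance = 0 if rel_pos == 1 else -rel_pos
    if distance <= core_upstream:
        return "core"
    if distance <= proximal_upstream:
        return "proximal"
    return "distal"


for pos in (-30, -100, -2000):
    print(pos, promoter_region(pos))
```

Enhancers and other position-independent regulatory regions are deliberately not modelled, in line with the note above that they are not counted as part of the distal promoter.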
Mammalian promoters
Up-regulated expression of genes in mammals is initiated when signals are transmitted to the promoters associated with the genes. Promoter DNA sequences may include different elements such as CpG islands (present in about 70% of promoters), a TATA box (present in about 24% of promoters), initiator (Inr) (present in about 49% of promoters), upstream and downstream TFIIB recognition elements (BREu and BREd) (present in about 22% of promoters), and downstream core promoter element (DPE) (present in about 12% of promoters). The presence of multiple methylated CpG sites in CpG islands of promoters causes stable silencing of genes. However, experiments by Weingarten-Gabbay et al. showed that the presence or absence of the other elements has relatively small effects on gene expression. Two sequences, the TATA box and Inr, caused small but significant increases in expression (45% and 28% increases, respectively). The BREu and the BREd elements significantly decreased expression by 35% and 20%, respectively, and the DPE element had no detected effect on expression.
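CpG islands, the most common of the elements listed above, are usually recognized computationally from base composition. The following is a minimal sketch with hypothetical function names. The thresholds (length of at least 200 bp, GC content above 50%, observed/expected CpG ratio above 0.6) follow one widely used definition that does not come from this article, so they should be read as assumptions.

```python
def cpg_stats(seq: str):
    """Return (GC content, observed/expected CpG ratio) for a sequence."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("empty sequence")
    c, g = seq.count("C"), seq.count("G")
    observed = seq.count("CG")          # CpG dinucleotides cannot overlap
    expected = c * g / n                # expectation from base composition
    ratio = observed / expected if expected else 0.0
    return (c + g) / n, ratio


def looks_like_cpg_island(seq: str, min_len: int = 200,
                          min_gc: float = 0.5, min_ratio: float = 0.6) -> bool:
    """Apply the assumed length, GC-content and CpG-ratio thresholds."""
    gc, ratio = cpg_stats(seq)
    return len(seq) >= min_len and gc > min_gc and ratio > min_ratio


print(cpg_stats("ACGT"))
print(looks_like_cpg_island("CG" * 100))
```

In practice these statistics are computed in sliding windows along a genome. The single-sequence version here only shows how the quantities are defined.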
Cis-regulatory modules that are localized in DNA regions distant from the promoters of genes can have very large effects on gene expression, with some genes undergoing up to 100-fold increased expression due to such a cis-regulatory module. These cis-regulatory modules include enhancers, silencers, insulators and tethering elements. Among this constellation of elements, enhancers and their associated transcription factors have a leading role in the regulation of gene expression.
Enhancers are regions of the genome that are major gene-regulatory elements. Enhancers control cell-type-specific gene expression programs, most often by looping through long distances to come into physical proximity with the promoters of their target genes. In a study of brain cortical neurons, 24,937 loops were found, bringing enhancers to promoters. Multiple enhancers, each often tens or hundreds of thousands of nucleotides distant from their target genes, loop to their target gene promoters and coordinate with each other to control expression of their common target gene.
The schematic illustration in this section shows an enhancer looping around to come into close physical proximity with the promoter of a target gene. The loop is stabilized by a dimer of a connector protein (e.g. dimer of CTCF or YY1), with one member of the dimer anchored to its binding motif on the enhancer and the other member anchored to its binding motif on the promoter (represented by the red zigzags in the illustration). Several cell-function-specific transcription factors (there are about 1,600 transcription factors in a human cell) generally bind to specific motifs on an enhancer, and a small combination of these enhancer-bound transcription factors, when brought close to a promoter by a DNA loop, governs the level of transcription of the target gene. Mediator (a complex usually consisting of about 26 proteins in an interacting structure) communicates regulatory signals from enhancer DNA-bound transcription factors directly to the RNA polymerase II (pol II) enzyme bound to the promoter.
Enhancers, when active, are generally transcribed from both strands of DNA with RNA polymerases acting in two different directions, producing two eRNAs as illustrated in the Figure.
An inactive enhancer may be bound by an inactive transcription factor. Phosphorylation of the transcription factor may activate it and that activated transcription factor may then activate the enhancer to which it is bound (see small red star representing phosphorylation of transcription factor bound to enhancer in the illustration).
An activated enhancer begins transcription of its RNA before activating a promoter to initiate transcription of messenger RNA from its target gene.
Bidirectional (mammalian)
Bidirectional promoters are short (<1 kbp) intergenic regions of DNA between the 5' ends of the genes in a bidirectional gene pair. A “bidirectional gene pair” refers to two adjacent genes coded on opposite strands, with their 5' ends oriented toward one another. The two genes are often functionally related, and modification of their shared promoter region allows them to be co-regulated and thus co-expressed. Bidirectional promoters are a common feature of mammalian genomes. About 11% of human genes are bidirectionally paired.
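The definition above translates directly into a search over gene annotations. The sketch below is illustrative only: the `Gene` record and `bidirectional_pairs` function are hypothetical, and it assumes that all genes lie on the same chromosome and do not overlap.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    start: int    # lowest genomic coordinate
    end: int      # highest genomic coordinate
    strand: str   # '+' or '-'

    @property
    def five_prime(self) -> int:
        """Coordinate of the 5' end, which depends on the strand."""
        return self.start if self.strand == "+" else self.end


def bidirectional_pairs(genes, max_gap: int = 1000):
    """Find adjacent, oppositely oriented genes whose 5' ends face each
    other across an intergenic region shorter than max_gap (1 kbp)."""
    ordered = sorted(genes, key=lambda g: g.start)
    pairs = []
    for left, right in zip(ordered, ordered[1:]):
        # 5' ends face each other only for a '-' gene followed by a '+' gene.
        if left.strand == "-" and right.strand == "+":
            gap = right.five_prime - left.five_prime
            if 0 < gap < max_gap:
                pairs.append((left.name, right.name, gap))
    return pairs


genes = [
    Gene("A", 100, 500, "-"),
    Gene("B", 900, 2000, "+"),
    Gene("C", 2500, 3000, "+"),
    Gene("D", 3200, 4000, "-"),
    Gene("E", 6000, 7000, "+"),
]
print(bidirectional_pairs(genes))
```

In this toy annotation only A and B qualify: C and D are convergent, and D and E are divergent but separated by more than 1 kbp.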
Bidirectionally paired genes in the Gene Ontology database shared at least one database-assigned functional category with their partners 47% of the time. Microarray analysis has shown bidirectionally paired genes to be co-expressed to a higher degree than random genes or neighboring unidirectional genes. Although co-expression does not necessarily indicate co-regulation, methylation of bidirectional promoter regions has been shown to downregulate both genes, and demethylation to upregulate both genes. There are exceptions to this, however. In some cases (about 11%), only one gene of a bidirectional pair is expressed. In these cases, the promoter is implicated in suppression of the non-expressed gene. The mechanism behind this could be competition for the same polymerases, or chromatin modification. Divergent transcription could shift nucleosomes to upregulate transcription of one gene, or remove bound transcription factors to downregulate transcription of one gene.
Some functional classes of genes are more likely to be bidirectionally paired than others. Genes implicated in DNA repair are five times more likely to be regulated by bidirectional promoters than by unidirectional promoters.
Chaperone proteins are three times more likely, and mitochondrial genes are more than twice as likely. Many basic housekeeping and cellular metabolic genes are regulated by bidirectional promoters.
The overrepresentation of bidirectionally paired DNA repair genes associates these promoters with cancer. Forty-five percent of human somatic oncogenes seem to be regulated by bidirectional promoters, significantly more than non-cancer-causing genes. Hypermethylation of the promoters between the gene pairs WNT9A/CD558500, CTDSPL/BC040563, and KCNK15/BF195580 has been associated with tumors.
Certain sequence characteristics have been observed in bidirectional promoters, including a lack of TATA boxes, an abundance of CpG islands, and a symmetry around the midpoint, with Cs and As dominant on one side and Gs and Ts on the other. A motif with the consensus sequence TCTCGCGAGA, also called the CGCG element, was recently shown to drive bidirectional Pol II transcription in CpG islands.
CCAAT boxes are common, as they are in many promoters that lack TATA boxes. In addition, the motifs NRF-1, GABPA, YY1, and ACTACAnnTCCC are represented in bidirectional promoters at significantly higher rates than in unidirectional promoters. The absence of TATA boxes in bidirectional promoters suggests that TATA boxes play a role in determining the directionality of promoters, but counterexamples (bidirectional promoters that do possess TATA boxes, and unidirectional promoters without them) indicate that they cannot be the only factor.
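The motifs named above can be searched for on both strands of a sequence. The following is a minimal sketch, not a published method: it assumes that a lowercase or uppercase `n` in a motif stands for any base, and the helper names `reverse_complement`, `motif_to_regex`, and `find_motif` are hypothetical. It also shows that the CGCG element TCTCGCGAGA is its own reverse complement, so it reads identically on either strand.

```python
import re

# Translation table for complementing DNA bases (n stays n).
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def motif_to_regex(motif: str) -> str:
    """Turn a motif into a regex; 'n' or 'N' matches any base (assumption)."""
    return "".join("[ACGT]" if ch in "nN" else ch.upper() for ch in motif)


def find_motif(seq: str, motif: str):
    """Return sorted (position, strand) hits, 0-based on the given strand.

    A '-' hit means the motif occurs on the opposite strand at that position.
    Motifs equal to their own reverse complement are reported on '+' only.
    """
    seq = seq.upper()
    # Lookahead patterns so that overlapping occurrences are all found.
    hits = [(m.start(), "+")
            for m in re.finditer("(?=" + motif_to_regex(motif) + ")", seq)]
    rc = reverse_complement(motif)
    if rc.upper() != motif.upper():
        hits += [(m.start(), "-")
                 for m in re.finditer("(?=" + motif_to_regex(rc) + ")", seq)]
    return sorted(hits)
```

For example, `find_motif("GGACTACAGTTCCCGG", "ACTACAnnTCCC")` reports one hit on the `+` strand, and scanning the reverse complement of that sequence reports the same site on the `-` strand.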
Although the term "bidirectional promoter" refers specifically to promoter regions of mRNA-encoding genes, luciferase assays have shown that over half of human genes do not have a strong directional bias. Research suggests that non-coding RNAs are frequently associated with the promoter regions of mRNA-encoding genes. It has been hypothesized that the recruitment and initiation of RNA polymerase II usually begins bidirectionally, but divergent transcription is halted at a checkpoint later during elongation. Possible mechanisms behind this regulation include sequences in the promoter region, chromatin modification, and the spatial orientation of the DNA.
Subgenomic
A subgenomic promoter is a promoter added to a virus for a specific heterologous gene, resulting in the formation of mRNA for that gene alone. Many positive-sense RNA viruses produce such subgenomic mRNAs (sgRNAs) as a common infection strategy, generally to transcribe late viral genes. Subgenomic promoters range from 24 nucleotides (Sindbis virus) to over 100 nucleotides (Beet necrotic yellow vein virus) and are usually found upstream of the transcription start.
Detection
A wide variety of algorithms have been developed to facilitate detection of promoters in genomic sequence, and promoter prediction is a common element of many gene prediction methods. In bacteria, prediction often relies on the -35 and -10 consensus sequences upstream of the transcription start site. The more closely a promoter's sequence matches these consensus sequences, the more often transcription of that gene will take place. Beyond these consensus elements, promoter regions follow no set pattern.
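Scoring a sequence against the -35 and -10 consensus elements can be sketched as below. This is an illustrative toy, not one of the prediction algorithms referred to above. It assumes the commonly cited ''E. coli'' sigma-70 hexamers TTGACA (-35) and TATAAT (-10) and a spacer of 16 to 18 bp between them, none of which is stated in this article. The function names are hypothetical.

```python
MINUS_35 = "TTGACA"  # assumed sigma-70 -35 consensus hexamer
MINUS_10 = "TATAAT"  # assumed sigma-70 -10 consensus hexamer


def matches(window: str, consensus: str) -> int:
    """Count positions where the window agrees with the consensus."""
    return sum(a == b for a, b in zip(window, consensus))


def best_promoter_hit(seq: str, spacers=(16, 17, 18)):
    """Find the best-scoring -35/-10 pair in seq.

    Returns (score, pos_35, pos_10, spacer) with 0-based positions and a
    score between 0 and 12, or None if the sequence is too short.
    """
    seq = seq.upper()
    best = None
    for i in range(len(seq) - len(MINUS_35) + 1):
        score_35 = matches(seq[i:i + 6], MINUS_35)
        for spacer in spacers:
            j = i + 6 + spacer  # start of the candidate -10 hexamer
            if j + 6 > len(seq):
                continue
            score = score_35 + matches(seq[j:j + 6], MINUS_10)
            # Strict '>' keeps the first (leftmost) hit among ties.
            if best is None or score > best[0]:
                best = (score, i, j, spacer)
    return best
```

A higher score means a closer match to the consensus. Real predictors weight positions unequally and use additional signals, so this count is only a rough proxy for promoter strength.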
Evolutionary change
Changes in promoter sequences are critical in evolution as indicated by the relatively stable number of genes in many lineages. For instance, most vertebrates have roughly the same number of protein-coding genes (about 20,000) which are often highly conserved in sequence, hence much of evolutionary change must come from changes in gene expression.
De novo origin of promoters
Given the short sequences of most promoter elements, promoters can rapidly evolve from random sequences. For instance, in ''E. coli'', ~60% of random sequences can evolve expression levels comparable to the wild-type lac promoter with only one mutation, and ~10% of random sequences can serve as active promoters even without evolution.
Binding
The initiation of transcription is a multistep sequential process that involves several mechanisms: promoter location, initial reversible binding of RNA polymerase, conformational changes in RNA polymerase, conformational changes in DNA, binding of nucleoside triphosphate (NTP) to the functional RNA polymerase-promoter complex, and nonproductive and productive initiation of RNA synthesis.
Promoter binding is central to understanding gene expression. Tuning synthetic genetic systems relies on precisely engineered synthetic promoters with known transcription rates.
Location
Although the RNA polymerase holoenzyme shows high affinity for non-specific sites on DNA, this characteristic alone does not explain how promoters are located.
Promoter location has instead been attributed to the structure of the holoenzyme-DNA and sigma 4-DNA complexes.
Diseases associated with aberrant function
Most diseases are heterogeneous in cause, meaning that one "disease" is often many different diseases at the molecular level, though symptoms exhibited and response to treatment may be identical. How diseases of different molecular origin respond to treatments is partially addressed in the discipline of pharmacogenomics.
Not listed here are the many kinds of cancers involving aberrant transcriptional regulation owing to creation of chimeric genes through pathological chromosomal translocation. Importantly, intervention in the number or structure of promoter-bound proteins is one key to treating a disease without affecting expression of unrelated genes sharing elements with the target gene. Some genes whose change is not desirable are capable of influencing the potential of a cell to become cancerous.
CpG islands in promoters
In humans, about 70% of promoters located near the transcription start site of a gene (proximal promoters) contain a CpG island.
CpG islands are generally 200 to 2000 base pairs long, have a C:G base pair content >50%, and contain frequent CpG sites, in which a cytosine nucleotide is followed by a guanine nucleotide in the linear sequence of bases along the 5' → 3' direction.
Distal promoters also frequently contain CpG islands, such as the promoter of the DNA repair gene ''ERCC1'', where the CpG island-containing promoter is located about 5,400 nucleotides upstream of the coding region of the ''ERCC1'' gene.
CpG islands also occur frequently in promoters for functional noncoding RNAs such as microRNAs.
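The length and composition criteria given above can be checked with a short sketch. The 200 to 2000 bp length range and the >50% C:G content come from the text; the observed/expected CpG ratio and its 0.6 cutoff are an added assumption (a commonly used criterion, not stated in this article). The function names are hypothetical.

```python
def cpg_stats(seq: str):
    """Return (gc_fraction, observed/expected CpG ratio) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # number of CpG dinucleotides
    gc_fraction = (c + g) / n if n else 0.0
    # Expected CpG count if C and G were placed independently.
    expected = c * g / n if n else 0.0
    obs_exp = cpg / expected if expected else 0.0
    return gc_fraction, obs_exp


def looks_like_cpg_island(seq: str, min_len=200, max_len=2000,
                          min_gc=0.5, min_obs_exp=0.6) -> bool:
    """Apply the length, C:G content, and (assumed) CpG ratio thresholds."""
    gc_fraction, obs_exp = cpg_stats(seq)
    return (min_len <= len(seq) <= max_len
            and gc_fraction > min_gc
            and obs_exp > min_obs_exp)
```

The ratio test separates sequences that are merely rich in C and G from those where C is actually followed by G: a run of 150 Cs followed by 150 Gs passes the content threshold but contains a single CpG site, so it is rejected.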
Methylation of CpG islands stably silences genes
In humans, DNA methylation occurs at the 5 position of the pyrimidine ring of the cytosine residues within CpG sites to form 5-methylcytosines. The presence of multiple methylated CpG sites in CpG islands of promoters causes stable silencing of genes.
Silencing of a gene may be initiated by other mechanisms, but this is often followed by methylation of CpG sites in the promoter CpG island to cause the stable silencing of the gene.
Promoter CpG hyper/hypo-methylation in cancer
Generally, in progression to cancer, hundreds of genes are silenced or activated. Although silencing of some genes in cancers occurs by mutation, a large proportion of carcinogenic gene silencing is a result of altered DNA methylation (see DNA methylation in cancer). DNA methylation causing silencing in cancer typically occurs at multiple CpG sites in the CpG islands that are present in the promoters of protein coding genes.
Altered expression of microRNAs also silences or activates many genes in progression to cancer (see microRNAs in cancer). Altered microRNA expression occurs through hyper/hypo-methylation of CpG sites in CpG islands in promoters controlling transcription of the microRNAs.
Silencing of DNA repair genes through methylation of CpG islands in their promoters appears to be especially important in progression to cancer (see
methylation of DNA repair genes in cancer).
Canonical sequences and wild-type
The usage of the term
canonical sequence to refer to a promoter is often problematic, and can lead to misunderstandings about promoter sequences. Canonical implies perfect, in some sense.
In the case of a transcription factor binding site, there may be a single sequence that binds the protein most strongly under specified cellular conditions. This might be called canonical.
However, natural selection may favor less energetic binding as a way of regulating transcriptional output. In this case, we may call the most common sequence in a population the wild-type sequence. It may not even be the most advantageous sequence to have under prevailing conditions.
Recent evidence also indicates that several genes (including the proto-oncogene c-myc) have G-quadruplex motifs as potential regulatory signals.
Synthetic promoter design and engineering
Promoters are important gene regulatory elements used in tuning synthetically designed genetic circuits and metabolic networks. For example, to overexpress an important gene in a network and thereby increase production of a target protein, synthetic biologists design promoters to upregulate its expression. Automated algorithms can be used to design neutral DNA or insulators that do not trigger gene expression of downstream sequences.
Diseases that may be associated with variations
Some cases of many genetic diseases are associated with variations in promoters or transcription factors.
Examples include:
* Asthma
* Beta thalassemia
* Rubinstein-Taybi syndrome
Constitutive vs regulated
Some promoters are called constitutive as they are active in all circumstances in the cell, while others are regulated, becoming active in the cell only in response to specific stimuli.
Use of the term
When referring to a promoter, some authors actually mean promoter + operator. For example, saying that the lac promoter is IPTG inducible means that, besides the lac promoter, the lac operator is also present. If the lac operator were not present, IPTG would not have an inducible effect.
Another example is the Tac-Promoter system (Ptac). Although tac is written as a promoter, it is in fact both a promoter and an operator.
See also
* Activator (genetics)
* Enhancer (genetics)
* Glossary of gene expression terms
* Operon
* Regulation of gene expression
* Repressor
* Transcription factor
* Promoter bashing
External links
* ORegAnno, the Open Regulatory Annotation Database
* Identifying a Protein Binding Sites on DNA molecule (YouTube tutorial video)
* Pleiades Promoter Project: a research project with an aim to generate 160 fully characterized human DNA promoters of less than 4 kb (MiniPromoters) to drive gene expression in defined brain regions of therapeutic interest.
* ENCODE threads Explorer: RNA and chromatin modification patterns around promoters. Nature (journal)