In molecular biology, a transcription factor (TF) (or sequence-specific DNA-binding factor) is a protein that controls the rate of transcription of genetic information from DNA to messenger RNA by binding to a specific DNA sequence. The function of TFs is to regulate genes (turn them on and off) so that they are expressed in the desired cells at the right time and in the right amount throughout the life of the cell and the organism. Groups of TFs function in a coordinated fashion to direct cell division, cell growth, and cell death throughout life; cell migration and organization (body plan) during embryonic development; and intermittently in response to signals from outside the cell, such as a hormone. There are up to 1600 TFs in the human genome. Transcription factors are members of the proteome as well as the regulome.

TFs work alone or with other proteins in a complex, by promoting (as an activator) or blocking (as a repressor) the recruitment of RNA polymerase (the enzyme that performs the transcription of genetic information from DNA to RNA) to specific genes. A defining feature of TFs is that they contain at least one DNA-binding domain (DBD), which attaches to a specific sequence of DNA adjacent to the genes that they regulate. TFs are grouped into classes based on their DBDs. Other proteins such as coactivators, chromatin remodelers, histone acetyltransferases, histone deacetylases, kinases, and methylases are also essential to gene regulation but lack DNA-binding domains and therefore are not TFs. TFs are of interest in medicine because TF mutations can cause specific diseases, and medications can potentially be targeted toward them.


Number

Transcription factors are essential for the regulation of gene expression and are, as a consequence, found in all living organisms. The number of transcription factors found within an organism increases with genome size, and larger genomes tend to have more transcription factors per gene. There are approximately 2800 proteins in the human genome that contain DNA-binding domains, and 1600 of these are presumed to function as transcription factors, though other studies indicate a smaller number. Therefore, approximately 10% of genes in the genome code for transcription factors, which makes this family the single largest family of human proteins. Furthermore, genes are often flanked by several binding sites for distinct transcription factors, and efficient expression of each of these genes requires the cooperative action of several different transcription factors (see, for example, hepatocyte nuclear factors). Hence, the combinatorial use of a subset of the approximately 1600 human transcription factors easily accounts for the unique regulation of each gene in the human genome during development.
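
To make the combinatorial argument concrete, the following minimal sketch counts how many distinct sets of a few transcription factors can be drawn from a pool of about 1600, the figure quoted above. The set sizes (2 to 4) and the function name `tf_combinations` are illustrative assumptions, and the count deliberately ignores biological constraints such as which factors are co-expressed in the same cell.

```python
from math import comb

N_TFS = 1600  # approximate number of human TFs quoted in the text


def tf_combinations(n_tfs: int, set_size: int) -> int:
    """Number of unordered sets of `set_size` distinct TFs from a pool of `n_tfs`."""
    return comb(n_tfs, set_size)


# Illustrative set sizes only; real regulatory regions vary widely.
for k in (2, 3, 4):
    print(k, tf_combinations(N_TFS, k))
```

Under these assumptions, pairs alone already give more than a million combinations, and the count grows rapidly with each additional factor.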


Mechanism

Transcription factors bind to either enhancer or promoter regions of DNA adjacent to the genes that they regulate. Depending on the transcription factor, the transcription of the adjacent gene is either up- or down-regulated. Transcription factors use a variety of mechanisms for the regulation of gene expression. These mechanisms include:
* stabilizing or blocking the binding of RNA polymerase to DNA
* catalyzing the acetylation or deacetylation of histone proteins. The transcription factor can either do this directly or recruit other proteins with this catalytic activity. Many transcription factors use one or the other of two opposing mechanisms to regulate transcription:
** histone acetyltransferase (HAT) activity – acetylates histone proteins, which weakens the association of DNA with histones and makes the DNA more accessible to transcription, thereby up-regulating transcription
** histone deacetylase (HDAC) activity – deacetylates histone proteins, which strengthens the association of DNA with histones and makes the DNA less accessible to transcription, thereby down-regulating transcription
* recruiting coactivator or corepressor proteins to the transcription factor-DNA complex


Function

Transcription factors are one of the groups of proteins that read and interpret the genetic "blueprint" in the DNA. They bind to the DNA and help initiate a program of increased or decreased gene transcription. As such, they are vital for many important cellular processes. Below are some of the important functions and biological roles transcription factors are involved in:


Basal transcriptional regulation

In eukaryotes, an important class of transcription factors called general transcription factors (GTFs) is necessary for transcription to occur. Many of these GTFs do not actually bind DNA, but rather are part of the large transcription preinitiation complex that interacts with RNA polymerase directly. The most common GTFs are TFIIA, TFIIB, TFIID (see also TATA binding protein), TFIIE, TFIIF, and TFIIH. The preinitiation complex binds to promoter regions of DNA upstream of the gene that it regulates.


Differential enhancement of transcription

Other transcription factors differentially regulate the expression of various genes by binding to enhancer regions of DNA adjacent to regulated genes. These transcription factors are critical to making sure that genes are expressed in the right cell at the right time and in the right amount, depending on the changing requirements of the organism.


Development

Many transcription factors in multicellular organisms are involved in development. Responding to stimuli, these transcription factors turn on or off the transcription of the appropriate genes, which, in turn, allows for changes in cell morphology or activities needed for cell fate determination and cellular differentiation. The Hox transcription factor family, for example, is important for proper body pattern formation in organisms as diverse as fruit flies and humans. Another example is the transcription factor encoded by the sex-determining region Y (SRY) gene, which plays a major role in determining sex in humans.


Response to intercellular signals

Cells can communicate with each other by releasing molecules that produce signaling cascades within another receptive cell. If the signal requires upregulation or downregulation of genes in the recipient cell, transcription factors will often be downstream in the signaling cascade. Estrogen signaling is an example of a fairly short signaling cascade that involves the estrogen receptor transcription factor: estrogen is secreted by tissues such as the ovaries and placenta, crosses the cell membrane of the recipient cell, and is bound by the estrogen receptor in the cell's cytoplasm. The estrogen receptor then goes to the cell's nucleus and binds to its DNA-binding sites, changing the transcriptional regulation of the associated genes.


Response to environment

Transcription factors act downstream not only of signaling cascades related to biological stimuli but also of signaling cascades involved in environmental stimuli. Examples include heat shock factor (HSF), which upregulates genes necessary for survival at higher temperatures; hypoxia inducible factor (HIF), which upregulates genes necessary for cell survival in low-oxygen environments; and sterol regulatory element binding protein (SREBP), which helps maintain proper lipid levels in the cell.


Cell cycle control

Many transcription factors, especially some that are proto-oncogenes or tumor suppressors, help regulate the cell cycle and as such determine how large a cell will get and when it can divide into two daughter cells. One example is the Myc oncogene, which has important roles in cell growth and apoptosis.


Pathogenesis

Transcription factors can also be used to alter gene expression in a host cell to promote pathogenesis. A well-studied example is the transcription activator-like effectors (TAL effectors) secreted by Xanthomonas bacteria. When injected into plants, these proteins can enter the nucleus of the plant cell, bind plant promoter sequences, and activate transcription of plant genes that aid in bacterial infection. TAL effectors contain a central repeat region in which there is a simple relationship between the identity of two critical residues in sequential repeats and sequential DNA bases in the TAL effector's target site. This property likely makes it easier for these proteins to evolve in order to better compete with the defense mechanisms of the host cell.


Regulation

It is common in biology for important processes to have multiple layers of regulation and control. This is also true with transcription factors: Not only do transcription factors control the rates of transcription to regulate the amounts of gene products (RNA and protein) available to the cell but transcription factors themselves are regulated (often by other transcription factors). Below is a brief synopsis of some of the ways that the activity of transcription factors can be regulated:


Synthesis

Transcription factors (like all proteins) are transcribed from a gene on a chromosome into RNA, and then the RNA is translated into protein. Any of these steps can be regulated to affect the production (and thus activity) of a transcription factor. An implication of this is that transcription factors can regulate themselves. For example, in a negative feedback loop, the transcription factor acts as its own repressor: If the transcription factor protein binds the DNA of its own gene, it down-regulates the production of more of itself. This is one mechanism to maintain low levels of a transcription factor in a cell.
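
The negative feedback loop described above can be illustrated with a toy model. The sketch below is not taken from the source: it assumes a simple Hill-type repression term, a first-order degradation rate, and arbitrary dimensionless parameters (`beta`, `gamma`, `K`, `n` are all hypothetical), and integrates the resulting equation with a basic Euler scheme.

```python
def simulate_tf_level(beta, gamma, K=None, n=2, dt=0.01, steps=5000):
    """Euler-integrate dx/dt = production(x) - gamma * x, starting from x = 0.

    x is the TF concentration (arbitrary units).
    If K is given, the TF represses its own production via a Hill function:
        production(x) = beta / (1 + (x / K) ** n)
    If K is None, production is constant (no feedback).
    """
    x = 0.0
    for _ in range(steps):
        production = beta if K is None else beta / (1.0 + (x / K) ** n)
        x += dt * (production - gamma * x)
    return x


# Hypothetical parameters chosen only for illustration.
unregulated = simulate_tf_level(beta=10.0, gamma=1.0)
autorepressed = simulate_tf_level(beta=10.0, gamma=1.0, K=1.0, n=2)
print(unregulated, autorepressed)
```

With these assumed parameters the self-repressed factor settles at a markedly lower level than the unregulated one, which is the qualitative behaviour the text attributes to negative autoregulation.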


Nuclear localization

In eukaryotes, the genes encoding transcription factors (like those encoding most proteins) are transcribed in the nucleus, and the resulting RNA is then translated in the cell's cytoplasm. Many proteins that are active in the nucleus contain nuclear localization signals that direct them to the nucleus. For many transcription factors, this localization step is a key point in their regulation. Important classes of transcription factors, such as some nuclear receptors, must first bind a ligand while in the cytoplasm before they can relocate to the nucleus.


Activation

Transcription factors may be activated (or deactivated) through their signal-sensing domain by a number of mechanisms including:
* ligand binding – Ligand binding can influence not only where a transcription factor is located within a cell but also whether it is in an active state and capable of binding DNA or other cofactors (see, for example, nuclear receptors).
* phosphorylation – Many transcription factors, such as STAT proteins, must be phosphorylated before they can bind DNA.
* interaction with other transcription factors (e.g., homo- or hetero-dimerization) or coregulatory proteins


Accessibility of DNA-binding site

In eukaryotes, DNA is organized with the help of histones into compact particles called nucleosomes, where sequences of about 147 DNA base pairs make ~1.65 turns around histone protein octamers. DNA within nucleosomes is inaccessible to many transcription factors. Some transcription factors, so-called pioneer factors, are still able to bind their DNA binding sites on the nucleosomal DNA. For most other transcription factors, the nucleosome must be actively unwound by molecular motors such as chromatin remodelers. Alternatively, the nucleosome can be partially unwrapped by thermal fluctuations, allowing temporary access to the transcription factor binding site. In many cases, a transcription factor needs to compete for binding to its DNA binding site with other transcription factors and histones or non-histone chromatin proteins. Pairs of transcription factors and other proteins can play antagonistic roles (activator versus repressor) in the regulation of the same gene.


Availability of other cofactors/transcription factors

Most transcription factors do not work alone. Many large TF families form complex homotypic or heterotypic interactions through dimerization. For gene transcription to occur, a number of transcription factors must bind to DNA regulatory sequences. This collection of transcription factors, in turn, recruits intermediary proteins such as cofactors that allow efficient recruitment of the preinitiation complex and RNA polymerase. Thus, for a single transcription factor to initiate transcription, all of these other proteins must also be present, and the transcription factor must be in a state where it can bind to them if necessary. Cofactors are proteins that modulate the effects of transcription factors. Cofactors are interchangeable between specific gene promoters; the protein complex that occupies the promoter DNA and the amino acid sequence of the cofactor determine its spatial conformation. For example, certain steroid receptors can exchange cofactors with NF-κB, which is a switch between inflammation and cellular differentiation; thereby steroids can affect the inflammatory response and function of certain tissues.


Interaction with methylated cytosine

Transcription factors and methylated cytosines in DNA both have major roles in regulating gene expression. (Methylation of cytosine in DNA primarily occurs where cytosine is followed by guanine in the 5' to 3' DNA sequence, a CpG site.) Methylation of CpG sites in a promoter region of a gene usually represses gene transcription, while methylation of CpGs in the body of a gene increases expression. TET enzymes play a central role in demethylation of methylated cytosines. Demethylation of CpGs in a gene promoter by TET enzyme activity increases transcription of the gene.

The DNA binding sites of 519 transcription factors were evaluated. Of these, 169 transcription factors (33%) did not have CpG dinucleotides in their binding sites, and 33 transcription factors (6%) could bind to a CpG-containing motif but did not display a preference for a binding site with either a methylated or unmethylated CpG. There were 117 transcription factors (23%) that were inhibited from binding to their binding sequence if it contained a methylated CpG site, 175 transcription factors (34%) that had enhanced binding if their binding sequence had a methylated CpG site, and 25 transcription factors (5%) that were either inhibited or had enhanced binding depending on where in the binding sequence the methylated CpG was located.

TET enzymes do not specifically bind to methylcytosine except when recruited (see DNA demethylation). Multiple transcription factors important in cell differentiation and lineage specification, including NANOG, SALL4A, WT1, EBF1, PU.1, and E2A, have been shown to recruit TET enzymes to specific genomic loci (primarily enhancers) to act on methylcytosine (mC) and convert it to hydroxymethylcytosine (hmC), in most cases marking it for subsequent complete demethylation to cytosine. TET-mediated conversion of mC to hmC appears to disrupt the binding of 5mC-binding proteins, including MECP2 and MBD (methyl-CpG-binding domain) proteins, facilitating nucleosome remodeling and the binding of transcription factors, thereby activating transcription of those genes.

EGR1 is an important transcription factor in memory formation. It has an essential role in brain neuron epigenetic reprogramming. The transcription factor EGR1 recruits the TET1 protein, which initiates a pathway of DNA demethylation. EGR1, together with TET1, is employed in programming the distribution of methylation sites on brain DNA during brain development and in learning (see Epigenetics in learning and memory).
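
As a quick arithmetic check on the breakdown of the 519 transcription factors given above, the following sketch recomputes each percentage from the stated counts. The category labels and the helper name `percentages` are shorthand introduced here for illustration; only the counts come from the text.

```python
TOTAL_TFS = 519  # number of TFs whose binding sites were evaluated (from the text)

# Counts quoted in the text; the keys are informal labels, not source terminology.
counts = {
    "no CpG in binding site": 169,
    "CpG present, no methylation preference": 33,
    "binding inhibited by methylated CpG": 117,
    "binding enhanced by methylated CpG": 175,
    "effect depends on CpG position": 25,
}


def percentages(category_counts, total):
    """Return each category's share of `total`, rounded to the nearest percent."""
    return {name: round(100 * n / total) for name, n in category_counts.items()}


shares = percentages(counts, TOTAL_TFS)
for name, pct in shares.items():
    print(f"{name}: {counts[name]} ({pct}%)")
print("sum of counts:", sum(counts.values()))
```

The five counts add up to exactly 519, and the recomputed shares match the quoted ones; the rounded percentages total 101 rather than 100 only because of rounding.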


Structure

Transcription factors are modular in structure and contain the following domains:
* DNA-binding domain (DBD), which attaches to specific sequences of DNA (enhancer or promoter sequences) adjacent to regulated genes. DNA sequences that bind transcription factors are often referred to as response elements.
* Activation domain (AD), which contains binding sites for other proteins such as transcription coregulators. These binding sites are frequently referred to as activation functions (AFs), transactivation domains (TADs), or trans-activating domains; this TAD should not be confused with the topologically associating domain, which shares the abbreviation.
* An optional signal-sensing domain (SSD) (e.g., a ligand-binding domain), which senses external signals and, in response, transmits these signals to the rest of the transcription complex, resulting in up- or down-regulation of gene expression.
Also, the DBD and signal-sensing domains may reside on separate proteins that associate within the transcription complex to regulate gene expression.


DNA-binding domain

The portion (domain) of the transcription factor that binds DNA is called its DNA-binding domain. Below is a partial list of some of the major families of DNA-binding domains/transcription factors:


Response elements

The DNA sequence that a transcription factor binds to is called a transcription factor-binding site or response element. Transcription factors interact with their binding sites using a combination of electrostatic forces (of which hydrogen bonds are a special case) and van der Waals forces. Due to the nature of these chemical interactions, most transcription factors bind DNA in a sequence-specific manner. However, not all bases in the transcription factor-binding site may actually interact with the transcription factor. In addition, some of these interactions may be weaker than others. Thus, transcription factors do not bind just one sequence but are capable of binding a subset of closely related sequences, each with a different strength of interaction. For example, although the consensus binding site for the TATA-binding protein (TBP) is TATAAAA, the TBP transcription factor can also bind similar sequences such as TATATAT or TATATAA.

Because transcription factors can bind a set of related sequences and these sequences tend to be short, potential transcription factor binding sites can occur by chance if the DNA sequence is long enough. It is unlikely, however, that a transcription factor will bind all compatible sequences in the genome of the cell. Other constraints, such as DNA accessibility in the cell or availability of cofactors, may also help dictate where a transcription factor will actually bind. Thus, given the genome sequence, it is still difficult to predict where a transcription factor will actually bind in a living cell. Additional recognition specificity, however, may be obtained through the use of more than one DNA-binding domain (for example, tandem DBDs in the same transcription factor or dimerization of two transcription factors) that bind to two or more adjacent sequences of DNA.
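
The idea that a factor tolerates near-matches to its consensus, and that short sites therefore turn up by chance, can be sketched in a few lines. The code below is a deliberately crude illustration, not how binding sites are predicted in practice: it treats every mismatch as equally costly, assumes all four bases are equally likely and independent, and uses hypothetical helper names (`hamming`, `find_sites`, `expected_exact_matches`) and a made-up test sequence.

```python
TBP_CONSENSUS = "TATAAAA"  # consensus TBP site quoted in the text


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_sites(sequence: str, consensus: str, max_mismatches: int):
    """Return (position, site) for every window within `max_mismatches` of the consensus."""
    k = len(consensus)
    return [
        (i, sequence[i:i + k])
        for i in range(len(sequence) - k + 1)
        if hamming(sequence[i:i + k], consensus) <= max_mismatches
    ]


def expected_exact_matches(seq_length: int, k: int) -> float:
    """Expected exact matches to one k-mer on one strand of a uniformly random sequence."""
    return (seq_length - k + 1) / 4 ** k


# Both alternative sites named in the text are close to the consensus.
print(hamming(TBP_CONSENSUS, "TATATAT"), hamming(TBP_CONSENSUS, "TATATAA"))

# Hypothetical toy sequence containing one of those variants.
print(find_sites("GGTATATAAGG", TBP_CONSENSUS, max_mismatches=1))

# Chance occurrences in an arbitrary one-megabase random sequence.
print(expected_exact_matches(1_000_000, len(TBP_CONSENSUS)))
```

Allowing mismatches enlarges the set of acceptable sequences and so raises the number of chance occurrences further, which is consistent with the point that sequence alone is a poor predictor of where a factor actually binds.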


Clinical significance

Transcription factors are of clinical significance for at least two reasons: (1) mutations can be associated with specific diseases, and (2) they can be targets of medications.


Disorders

Due to their important roles in development, intercellular signaling, and the cell cycle, some human diseases have been associated with mutations in transcription factors. Many transcription factors are either tumor suppressors or oncogenes, and, thus, mutations or aberrant regulation of them is associated with cancer. Three groups of transcription factors are known to be important in human cancer: (1) the NF-kappaB and AP-1 families, (2) the STAT family, and (3) the steroid receptors. Below are a few of the better-studied examples:


Potential drug targets

Approximately 10% of currently prescribed drugs directly target the nuclear receptor class of transcription factors. Examples include tamoxifen and bicalutamide for the treatment of breast and prostate cancer, respectively, and various types of anti-inflammatory and anabolic steroids. In addition, transcription factors are often indirectly modulated by drugs through signaling cascades. It might be possible to directly target other less-explored transcription factors, such as NF-κB, with drugs. Transcription factors outside the nuclear receptor family are thought to be more difficult to target with small-molecule therapeutics, since it is not clear that they are "druggable", but progress has been made on Pax2 and the Notch pathway.


Role in evolution

Gene duplications have played a crucial role in the evolution of species. This applies particularly to transcription factors. Once a transcription factor gene is duplicated, mutations can accumulate in one copy without negatively affecting the regulation of downstream targets. However, changes in the DNA-binding specificities of the single-copy LEAFY transcription factor, which occurs in most land plants, have recently been elucidated. In that respect, a single-copy transcription factor can undergo a change of specificity through a promiscuous intermediate without losing function. Similar mechanisms have been proposed in the context of all alternative phylogenetic hypotheses, and for the role of transcription factors in the evolution of all species.


Role in biocontrol activity

Transcription factors play a role in stress resistance, which is important for successful biocontrol activity. In ''Papiliotrema terrestris'' LS28, resistance to oxidative stress and alkaline pH sensing depend on the transcription factors Yap1 and Rim101, respectively. Used as molecular tools, these factors have given insight into the genetic mechanisms underlying biocontrol activity, which will support disease management programs based on biological and integrated control.


Analysis

There are different technologies available to analyze transcription factors. On the genomic level, DNA sequencing and database research are commonly used. The transcription factor protein can be detected with specific antibodies, with the sample visualized on a western blot. Electrophoretic mobility shift assay (EMSA) can be used to detect the activation profile of transcription factors. A multiplex approach for activation profiling is a TF chip system, in which several different transcription factors can be detected in parallel. The most commonly used method for identifying transcription factor binding sites is chromatin immunoprecipitation (ChIP). This technique relies on chemical fixation of chromatin with formaldehyde, followed by co-precipitation of DNA and the transcription factor of interest using an antibody that specifically targets that protein. The DNA sequences can then be identified by microarray or high-throughput sequencing (ChIP-seq) to determine transcription factor binding sites. If no antibody is available for the protein of interest, DamID may be a convenient alternative.


Classes

As described in more detail below, transcription factors may be classified by their (1) mechanism of action, (2) regulatory function, or (3) sequence homology (and hence structural similarity) in their DNA-binding domains.


Mechanistic

There are two mechanistic classes of transcription factors:
* General transcription factors are involved in the formation of a preinitiation complex. The most common are abbreviated as TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH. They are ubiquitous and interact with the core promoter region surrounding the transcription start site(s) of all class II genes.
* Upstream transcription factors are proteins that bind somewhere upstream of the initiation site to stimulate or repress transcription. These are roughly synonymous with specific transcription factors, because they vary considerably depending on what recognition sequences are present in the proximity of the gene.


Functional

Transcription factors have been classified according to their regulatory function:
* I. constitutively active – present in all cells at all times – general transcription factors, Sp1, NF1, CCAAT
* II. conditionally active – requires activation
** II.A developmental (cell specific) – expression is tightly controlled, but, once expressed, requires no additional activation – GATA, HNF, PIT-1, MyoD, Myf5, Hox, Winged Helix
** II.B signal-dependent – requires external signal for activation
*** II.B.1 extracellular ligand (endocrine or paracrine)-dependent – nuclear receptors
*** II.B.2 intracellular ligand (autocrine)-dependent – activated by small intracellular molecules – SREBP, p53, orphan nuclear receptors
*** II.B.3 cell membrane receptor-dependent – second messenger signaling cascades resulting in the phosphorylation of the transcription factor
**** II.B.3.a resident nuclear factors – reside in the nucleus regardless of activation state – CREB, AP-1, Mef2
**** II.B.3.b latent cytoplasmic factors – inactive form resides in the cytoplasm, but, when activated, is translocated into the nucleus – STAT, R-SMAD, NF-κB, Notch, TUBBY, NFAT


Structural

Transcription factors are often classified based on the sequence similarity and hence the tertiary structure of their DNA-binding domains:
* 1 Superclass: Basic Domains
** 1.1 Class: Leucine zipper factors (bZIP)
*** 1.1.1 Family: AP-1(-like) components; includes (c-Fos/c-Jun)
*** 1.1.2 Family: CREB
*** 1.1.3 Family: C/EBP-like factors
*** 1.1.4 Family: bZIP / PAR
*** 1.1.5 Family: Plant G-box binding factors
*** 1.1.6 Family: ZIP only
** 1.2 Class: Helix-loop-helix factors (bHLH)
*** 1.2.1 Family: Ubiquitous (class A) factors
*** 1.2.2 Family: Myogenic transcription factors (MyoD)
*** 1.2.3 Family: Achaete-Scute
*** 1.2.4 Family: Tal/Twist/Atonal/Hen
** 1.3 Class: Helix-loop-helix / leucine zipper factors (bHLH-ZIP)
*** 1.3.1 Family: Ubiquitous bHLH-ZIP factors; includes USF (USF1, USF2); SREBP (SREBP)
*** 1.3.2 Family: Cell-cycle controlling factors; includes c-Myc
** 1.4 Class: NF-1
*** 1.4.1 Family: NF-1 (A, B, C, X)
** 1.5 Class: RF-X
*** 1.5.1 Family: RF-X (1, 2, 3, 4, 5, ANK)
** 1.6 Class: bHSH
* 2 Superclass: Zinc-coordinating DNA-binding domains
** 2.1 Class: Cys4 zinc finger of nuclear receptor type
*** 2.1.1 Family: Steroid hormone receptors
*** 2.1.2 Family: Thyroid hormone receptor-like factors
** 2.2 Class: diverse Cys4 zinc fingers
*** 2.2.1 Family: GATA-Factors
** 2.3 Class: Cys2His2 zinc finger domain
*** 2.3.1 Family: Ubiquitous factors, includes TFIIIA, Sp1
*** 2.3.2 Family: Developmental / cell cycle regulators; includes Krüppel
*** 2.3.4 Family: Large factors with NF-6B-like binding properties
** 2.4 Class: Cys6 cysteine-zinc cluster
** 2.5 Class: Zinc fingers of alternating composition
* 3 Superclass: Helix-turn-helix
** 3.1 Class: Homeo domain
*** 3.1.1 Family: Homeo domain only; includes Ubx
*** 3.1.2 Family: POU domain factors; includes Oct
*** 3.1.3 Family: Homeo domain with LIM region
*** 3.1.4 Family: homeo domain plus zinc finger motifs
** 3.2 Class: Paired box
*** 3.2.1 Family: Paired plus homeo domain
*** 3.2.2 Family: Paired domain only
** 3.3 Class: Fork head / winged helix
*** 3.3.1 Family: Developmental regulators; includes forkhead
*** 3.3.2 Family: Tissue-specific regulators
*** 3.3.3 Family: Cell-cycle controlling factors
*** 3.3.0 Family: Other regulators
** 3.4 Class: Heat Shock Factors
*** 3.4.1 Family: HSF
** 3.5 Class: Tryptophan clusters
*** 3.5.1 Family: Myb
*** 3.5.2 Family: Ets-type
*** 3.5.3 Family: Interferon regulatory factors
** 3.6 Class: TEA (transcriptional enhancer factor) domain
*** 3.6.1 Family: TEA (TEAD1, TEAD2, TEAD3, TEAD4)
* 4 Superclass: beta-Scaffold Factors with Minor Groove Contacts
** 4.1 Class: RHR (Rel homology region)
*** 4.1.1 Family: Rel/ankyrin; NF-kappaB
*** 4.1.2 Family: ankyrin only
*** 4.1.3 Family: NFAT (Nuclear Factor of Activated T-cells) (NFATC1, NFATC2, NFATC3)
** 4.2 Class: STAT
*** 4.2.1 Family: STAT
** 4.3 Class: p53
*** 4.3.1 Family: p53
** 4.4 Class: MADS box
*** 4.4.1 Family: Regulators of differentiation; includes (Mef2)
*** 4.4.2 Family: Responders to external signals, SRF (serum response factor)
*** 4.4.3 Family: Metabolic regulators (ARG80)
** 4.5 Class: beta-Barrel alpha-helix transcription factors
** 4.6 Class: TATA binding proteins
*** 4.6.1 Family: TBP
** 4.7 Class: HMG-box
*** 4.7.1 Family: SOX genes, SRY
*** 4.7.2 Family: TCF-1 (TCF1)
*** 4.7.3 Family: HMG2-related, SSRP1
*** 4.7.4 Family: UBF
*** 4.7.5 Family: MATA
** 4.8 Class: Heteromeric CCAAT factors
*** 4.8.1 Family: Heteromeric CCAAT factors
** 4.9 Class: Grainyhead
*** 4.9.1 Family: Grainyhead
** 4.10 Class: Cold-shock domain factors
*** 4.10.1 Family: csd
** 4.11 Class: Runt
*** 4.11.1 Family: Runt
* 0 Superclass: Other Transcription Factors
** 0.1 Class: Copper fist proteins
** 0.2 Class: HMGI(Y) (HMGA1)
*** 0.2.1 Family: HMGI(Y)
** 0.3 Class: Pocket domain
** 0.4 Class: E1A-like factors
** 0.5 Class: AP2/EREBP-related factors
*** 0.5.1 Family: AP2
*** 0.5.2 Family: EREBP
*** 0.5.3 Superfamily: AP2/B3
**** 0.5.3.1 Family: ARF
**** 0.5.3.2 Family: ABI
**** 0.5.3.3 Family: RAV
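
The dotted identifiers in the structural classification encode the hierarchy directly: each additional component is one level deeper, and dropping the last component gives the parent. The following is a minimal sketch of how such identifiers could be handled programmatically. The function names and the returned dictionary layout are assumptions made for illustration, not part of any published tool. Depth is reported as a number rather than a rank name, because rank names do not map strictly to depth (entry 0.5.3 is labeled a superfamily, with families one level below it).

```python
def parse_class_id(identifier):
    """Split a dotted classification identifier into its hierarchy.

    Returns the depth (1 = superclass level) and the identifiers of all
    ancestors, ordered from the superclass downward.
    """
    parts = identifier.split(".")
    if not all(part.isdigit() for part in parts):
        raise ValueError(f"not a dotted numeric identifier: {identifier!r}")
    ancestors = [".".join(parts[:i]) for i in range(1, len(parts))]
    return {
        "id": identifier,
        "depth": len(parts),
        "superclass": parts[0],
        "ancestors": ancestors,
    }


def is_descendant(child, parent):
    """True if `child` lies strictly below `parent` in the hierarchy.

    The trailing dot matters: 4.10.1 is below 4.10, not below 4.1.
    """
    return child.startswith(parent + ".")


print(parse_class_id("0.5.3.1"))
print(is_descendant("4.1.3", "4.1"), is_descendant("4.10.1", "4.1"))
```

Comparing on the prefix plus a dot, rather than on the bare prefix, avoids treating class 4.10 (Cold-shock domain factors) as a member of class 4.1 (RHR).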


See also

* Cdx protein family
* DNA-binding protein
* Inhibitor of DNA-binding protein
* Mapper(2)
* Nuclear receptor, a class of ligand-activated transcription factors
* Open Regulatory Annotation Database
* Phylogenetic footprinting
* TRANSFAC database
* YeTFaSCo


Further reading

* Carretero-Paulet, Lorenzo; Galstyan, Anahit; Roig-Villanova, Irma; Martínez-García, Jaime F.; Bilbao-Castro, Jose R. "Genome-Wide Classification and Evolutionary Analysis of the bHLH Family of Transcription Factors in Arabidopsis, Poplar, Rice, Moss, and Algae". ''Plant Physiology'', 153 (3), July 2010, pp. 1398–1412. DOI 10.1104/pp.110.153593


External links

* Transcription factor database
* Plant Transcription Factor Database and Transcriptional Regulation Data and Analysis Platform