Glis1 (Glis Family Zinc Finger 1) is a gene encoding a Krüppel-like protein of the same name whose locus is found on chromosome 1p32.3.
The gene is enriched in unfertilised eggs and embryos at the one-cell stage [*] and it can be used to promote direct reprogramming of somatic cells to induced pluripotent stem cells, also known as iPS cells.
Glis1 is a highly promiscuous transcription factor, regulating the expression of numerous genes, either positively or negatively. In organisms, Glis1 does not appear to have any directly important functions. Mice whose Glis1 gene has been removed have no noticeable change to their phenotype.
Structure
Glis1 is an 84.3 kDa proline-rich protein composed of 789 amino acids.
No crystal structure has yet been determined for Glis1; however, many parts of its amino acid sequence are homologous to other proteins whose structures have been solved.
Zinc finger domain
Glis1 uses a zinc finger domain comprising five tandem Cys2His2 zinc finger motifs (meaning the zinc atom is coordinated by two cysteine and two histidine residues) to interact with target DNA sequences to regulate gene transcription. The domain interacts sequence-specifically with the DNA, following the major groove along the double helix. It recognises the consensus sequence GACCACCCAC.
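As a rough illustration of what a consensus sequence means in practice, the sketch below scans a DNA string for exact copies of GACCACCCAC on either strand. The function names and the example sequence are made up for illustration, and exact matching is a simplifying assumption: real binding sites usually tolerate some mismatches.

```python
# Consensus sequence quoted in the text above.
CONSENSUS = "GACCACCCAC"

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_sites(dna: str, motif: str = CONSENSUS) -> list[tuple[int, str]]:
    """List (0-based start, strand) for every exact motif match.

    Strand "+" means the motif appears as written; "-" means its
    reverse complement appears, i.e. the site lies on the other strand.
    """
    dna = dna.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = dna.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = dna.find(pattern, start + 1)
    return sorted(hits)


# Hypothetical test sequence, not taken from any real genome.
example = "TTGACCACCCACAAGTGGGTGGTCTT"
sites = find_sites(example)
```

Here `sites` holds one hit per strand, since the example was built to contain the motif once as written and once as its reverse complement.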
The individual zinc finger motifs are separated from one another by the amino acid sequence (T/S)GEKP(Y/F)X, where X can be any amino acid and (A/B) can be either A or B. This domain is homologous to the zinc finger domain found in Gli1 and so is thought to interact with DNA in the same way.
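The linker notation (T/S)GEKP(Y/F)X maps directly onto a regular expression over one-letter amino acid codes. The following is a minimal sketch of that mapping; the helper name and the peptide string are hypothetical and are not taken from the actual Glis1 sequence.

```python
import re

# (T/S)GEKP(Y/F)X written as a regex: [TS] is "T or S", [YF] is "Y or F",
# and the final [A-Z] stands for X, i.e. any single residue.
LINKER = re.compile(r"[TS]GEKP[YF][A-Z]")


def find_linkers(protein: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each linker-like stretch."""
    return [(m.start(), m.group()) for m in LINKER.finditer(protein.upper())]


# Made-up peptide containing two linker-like stretches.
peptide = "AAHTGEKPFKCDDRSGEKPYQCEE"
linkers = find_linkers(peptide)
```

Applied to a real protein sequence, the gaps between successive matches would give a first estimate of where the individual zinc finger motifs lie.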
The alpha helices of the fourth and fifth zinc fingers are inserted into the major groove and make the most extensive contact of all the zinc fingers with the DNA.
Very few contacts are made by the second and third fingers, and the first finger does not contact the DNA at all.
The first finger does make numerous
protein-protein interactions with the second zinc finger, however.
Termini
Glis1 has an activation domain at its C-terminus and a repressive domain at its N-terminus. The repressive domain is much stronger than the activation domain, meaning transcription is weak. The activation domain of Glis1 is four times stronger in the presence of CaM kinase IV. This may be due to a coactivator. A proline-rich region of the protein is also found towards the N-terminus. The protein's termini are fairly unusual and have no strong sequence similarity to other proteins.
Use in cell reprogramming
Glis1 can be used as one of the four factors used in reprogramming somatic cells to induced pluripotent stem cells.
The three transcription factors Oct3/4, Sox2 and Klf4 are essential for reprogramming but are extremely inefficient on their own, fully reprogramming only roughly 0.005% of the cells treated with the factors.
When Glis1 is introduced with these three factors, the efficiency of reprogramming is massively increased, producing many more fully reprogrammed cells. The transcription factor c-Myc can also be used as the fourth factor and was the original fourth factor used by Shinya Yamanaka, who received the 2012 Nobel Prize in Physiology or Medicine for his work on the conversion of somatic cells to iPS cells.
Yamanaka's work offers a way of bypassing the controversy surrounding stem cells.
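To put the three-factor efficiency in concrete terms, the short calculation below converts the 0.005% figure quoted above into an expected number of fully reprogrammed cells. The function name and the example cell count are illustrative assumptions, and the text gives no numerical efficiency for Glis1, so none is modelled here.

```python
# Three-factor efficiency quoted in the text: roughly 0.005% of treated cells.
BASELINE_EFFICIENCY = 0.005 / 100


def expected_reprogrammed(cells_treated: int,
                          efficiency: float = BASELINE_EFFICIENCY) -> float:
    """Expected number of fully reprogrammed cells for a given efficiency."""
    return cells_treated * efficiency


# Hypothetical experiment size: one million treated cells.
baseline = expected_reprogrammed(1_000_000)
```

At this rate a million treated cells would be expected to yield only about fifty fully reprogrammed cells, which is why a fourth factor that raises efficiency matters.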
Mechanism
Somatic cells are most often fully differentiated in order to perform a specific function, and therefore only express the genes required to perform their function. This means the genes that are required for differentiation to other types of cell are packaged within chromatin structures, so that they are not expressed.
Glis1 reprograms cells by promoting multiple pro-reprogramming pathways.
These pathways are activated by the up-regulation of the transcription factors N-Myc, Mycl1, c-Myc, Nanog, ESRRB, FOXA2, GATA4 and NKX2-5, as well as the other three factors used for reprogramming.
Glis1 also up-regulates expression of the protein LIN28, which binds the let-7 microRNA precursor, preventing production of active let-7. Let-7 microRNAs reduce the expression of pro-reprogramming genes via RNA interference.
Glis1 is also able to associate directly with the other three reprogramming factors, which may help their function.
The result of the various changes in gene expression is the conversion of heterochromatin, which is very difficult to access, to euchromatin, which can be easily accessed by transcriptional proteins and enzymes such as RNA polymerase.
During reprogramming, histones, which make up nucleosomes, the complexes used to package DNA, are generally demethylated and acetylated, 'unpacking' the DNA by neutralising the positive charge of the lysine residues on the N-termini of histones.
Advantages over c-myc
Glis1 has a number of extremely important advantages over c-myc in cell reprogramming.
*No risk of cancer: Although c-myc enhances the efficiency of reprogramming, its major disadvantage is that it is a proto-oncogene, meaning the iPS cells produced using c-myc are much more likely to become cancerous. This is an enormous obstacle between iPS cells and their use in medicine. When Glis1 is used in cell reprogramming, there is no increased risk of cancer development.
*Production of fewer 'bad' colonies: While c-myc promotes the proliferation of reprogrammed cells, it also promotes the proliferation of 'bad' cells which have not reprogrammed properly and make up the vast majority of cells in a dish of treated cells. Glis1 actively suppresses the proliferation of cells that have not fully reprogrammed, making the selection and harvesting of the properly reprogrammed cells less laborious. This is likely because many of these 'bad' cells express Glis1 but not all four of the reprogramming factors. When expressed on its own, Glis1 inhibits proliferation.
*More efficient reprogramming: The use of Glis1 reportedly produces more fully reprogrammed iPS cells than c-myc. This is an important quality given the inefficiency of reprogramming.
Disadvantages
*Inhibition of proliferation: Failure to stop Glis1 expression after reprogramming inhibits cell proliferation and ultimately leads to the death of the reprogrammed cell. Therefore, careful regulation of Glis1 expression is required. This may explain why Glis1 expression is switched off in embryos after they have started to divide.
Roles in disease
Glis1 has been implicated in a number of diseases and disorders.
Psoriasis
Glis1 has been shown to be heavily up-regulated in psoriasis, a disease which causes chronic inflammation of the skin. Normally, Glis1 is not expressed in the skin at all. However, during inflammation, it is expressed in the spinous layer of the skin, the second layer from the bottom of four layers, as a response to the inflammation. This is the last layer where the cells have nuclei and thus the last layer where gene expression occurs. It is believed that the role of Glis1 in this disease is to promote cell differentiation in the skin by increasing the expression of multiple pro-differentiation genes such as IGFBP2, which inhibits proliferation and can also promote apoptosis. It also decreases the expression of Jagged1, a ligand of notch in the notch signaling pathway, and Frizzled10, a receptor in the wnt signaling pathway.
Late onset Parkinson's Disease
A certain allele of Glis1, which exists due to a single nucleotide polymorphism (a change in a single nucleotide of the DNA sequence of the gene), has been implicated as a risk factor in the neurodegenerative disorder Parkinson's disease. The allele is linked to the late onset variety of Parkinson's, which is acquired in old age. The reason behind this link is not yet known.