C1orf112

Chromosome 1 open reading frame 112 (C1orf112) is a protein that in humans is encoded by the C1orf112 gene, which is located at position 1q24.2. The gene encodes seventeen mRNA variants, fifteen of which produce functional proteins. The C1orf112 precursor has a molecular weight of 96.6 kDa and an isoelectric point of 5.62. C1orf112 has been experimentally determined to localize to the mitochondria, although it does not contain a mitochondrial targeting sequence.


Gene

The gene spans 192,073 base pairs and contains 29 exons. C1orf112 is located at position 1q24.2 and shares antisense coding regions with C1orf156 and SCYL3.


Protein

There are currently eight experimentally determined RefSeq isoforms. C1orf112 contains a domain of unknown function, DUF4487.


Composition

Compositional analysis with SAPS predicted much less glycine and much more leucine than expected relative to other human protein sequences. This characteristic is conserved across primate orthologs. A mixed charge cluster was found in Isoform X1 from position 747 to 805, suggesting that this segment may be aqueous and tightly bound. This mixed charge cluster is only partially conserved across orthologs.
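
The kind of comparison described above (observed residue frequency versus an expected background) can be sketched in Python. This is only an illustration and not the SAPS algorithm: the sequence and the reference frequencies below are made-up placeholders, and the function names are hypothetical.

```python
from collections import Counter


def composition(seq):
    """Return the fraction of each residue in a protein sequence."""
    seq = seq.upper()
    counts = Counter(seq)
    total = len(seq)
    return {aa: n / total for aa, n in counts.items()}


def enrichment(seq, reference, residues):
    """Ratio of observed to reference frequency for the chosen residues.

    A ratio above 1 means the residue is more common than in the reference;
    a ratio below 1 means it is less common.
    """
    observed = composition(seq)
    return {aa: observed.get(aa, 0.0) / reference[aa] for aa in residues}


# Placeholder data for illustration only: this is NOT the C1orf112 sequence,
# and these are NOT real proteome-wide frequencies.
toy_seq = "MLLKLLELAGLLSLLDKLLE"
toy_reference = {"G": 0.10, "L": 0.10}

ratios = enrichment(toy_seq, toy_reference, ["G", "L"])
for aa, ratio in sorted(ratios.items()):
    print(f"{aa}: {ratio:.2f}x the reference frequency")
```

With a real sequence and a real background table, a leucine ratio well above 1 and a glycine ratio well below 1 would correspond to the pattern reported for C1orf112. Whether such a deviation is statistically significant is a separate question that this sketch does not address.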


Transcripts

Ensembl lists 9 transcripts (splice variants) for C1orf112.


Subcellular localization

Antibody immunocytochemistry and immunofluorescent staining of the human cell line A-431 indicate that C1orf112 is localized to the mitochondria.


Regulation


Gene level regulation


Expression

Although tissue-level expression is ubiquitous, C1orf112 is expressed most highly in the testes, lymph nodes, brain marrow, and cerebellum, based on samples from 97 individuals across 27 different tissues. In situ hybridization of the human transcriptome indicates that expression is highest in the atrioventricular node, followed by the testis, testis germ cells, and testis interstitial tissue.


Transcript level regulation

Transcription factor assessment indicates many potential binding sites for TATA-binding protein and CCAAT-enhancer-binding proteins, along with sites for transcription factors associated with the testis, thymus, kidneys, and cardiac tissue.


Protein level regulation

There are two ubiquitination sites on C1orf112 isoform X1, at lysine 73 and at position 783. Downstream of the reading frame, there are three polyadenylation signals. In addition, there is an N6-acetyllysine site at position 747 and a phosphoserine site at serine 23. C1orf112 has been found experimentally to interact with ATG1, a protein associated with aldosterone secretion whose overexpression characterizes certain forms of breast cancer. Predicted post-translational modifications include modification by O-glycosyl-oligosaccharide-glycoprotein N-acetylglucosaminyltransferase III, sumoylation, and sumoylation interaction sites.


Interacting proteins

C1orf112 is predicted to interact with a diverse range of proteins, including multiple mitosis-associated proteins. C1orf112 is also predicted to interact with FIGNL1, a protein involved in DNA double-stranded break repair via homologous recombination. Experimental findings indicate that C1orf112 interacts with NUF2, a spindle-pole body protein that plays a critical role in nuclear division, and TTK, a protein kinase capable of phosphorylating serine, threonine, and tyrosine.


Homology/evolution


Paralogs

There are no known paralogs of C1orf112.


Orthologs

C1orf112 is highly conserved in primates such as ''Pan troglodytes'' and ''Rhinopithecus bieti'' and in other mammals such as ''Castor canadensis'' and ''Miniopterus natalensis'', with percent identity to ''Homo sapiens'' C1orf112 greater than 90%. Orthologs with the greatest date of divergence (date of speciation) from human C1orf112 include ''Trichosporon asahii'', ''Amphimedon queenslandica'', and a placozoan, indicating that C1orf112 has been preserved over evolutionary time. Date of divergence was calculated using TimeTree. The E value indicates the number of "hits" one can expect to see by chance when searching the NCBI database, with a low E value indicating a significant result. Percent identity is the percentage of characters that align identically to ''Homo sapiens'' C1orf112 Isoform X1, while percent similarity is the degree of resemblance when the two sequences are aligned with one another.
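
The percent identity measure defined above can be made concrete with a short Python sketch. It assumes the two sequences have already been aligned by some other tool and uses short invented strings rather than real C1orf112 data. The function name and the choice of denominator (all alignment columns that are not gap-only) are assumptions for illustration; alignment tools differ in how they count gapped columns.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences have the same residue.

    Assumes the inputs are already aligned: equal length, gaps written as '-'.
    Columns where both sequences have a gap are ignored; a column with a gap
    in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        columns += 1
        if a == b:
            matches += 1
    if columns == 0:
        return 0.0
    return 100.0 * matches / columns


# Invented toy alignment, not taken from any real ortholog.
toy_human = "MKLLE-AGLS"
toy_ortholog = "MKLIEQAG-S"

print(percent_identity(toy_human, toy_ortholog))
```

Percent similarity would additionally count columns where the residues differ but are chemically similar, which requires a substitution matrix or residue grouping that is outside the scope of this sketch.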


Protein structure


Secondary and tertiary structure

C1orf112 secondary structure is predicted to be predominantly alpha helical, with less than 5% of the protein composed of beta sheets. I-TASSER predicts ligand binding sites between positions 377 and 530 in Isoform X1. MyHits predicts a leucine zipper motif in Isoform X1 at positions 831 to 852.
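
As an illustration of how a motif of this kind can be located, the sketch below scans a sequence for the PROSITE-style leucine zipper pattern L-x(6)-L-x(6)-L-x(6)-L, which spans 22 residues, the same length as the 831 to 852 segment. It is an assumption that this is the pattern behind the MyHits prediction; the sequence is an invented placeholder and the function name is hypothetical.

```python
import re

# PROSITE-style pattern L-x(6)-L-x(6)-L-x(6)-L written as a regular expression.
# The lookahead wrapper lets overlapping matches be reported as well.
LEUCINE_ZIPPER = re.compile(r"(?=(L.{6}L.{6}L.{6}L))")
MOTIF_LENGTH = 22


def find_leucine_zippers(seq):
    """Return (start, end) positions, 1-based and inclusive, of each match."""
    return [
        (m.start() + 1, m.start() + MOTIF_LENGTH)
        for m in LEUCINE_ZIPPER.finditer(seq.upper())
    ]


# Invented toy sequence, NOT the C1orf112 sequence: a leucine every
# seventh residue, flanked by a few unrelated residues.
toy_seq = "MK" + "LAAAAAA" * 3 + "L" + "GS"

hits = find_leucine_zippers(toy_seq)
print(hits)
```

A pattern match of this kind is only suggestive. The pattern is short and degenerate, so matches can occur by chance, and a hit does not by itself show that the segment forms a functional leucine zipper.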


Clinical significance

C1orf112 was one of many genes found to be co-expressed with cancer-associated genes, and the knockdown of this gene in a HeLa cell line suppressed growth.

