Chromosome 18 open reading frame 63 is a
protein
which in humans is encoded by the C18orf63
gene
.
This protein is not yet well understood. Research suggests that C18orf63 could be a potential
biomarker
for early-stage
pancreatic cancer and
breast cancer
.
Gene
This gene is located at band 22, sub-band 3, on the long arm of
chromosome 18. It is composed of 5065
base pairs spanning from 74,315,875 to 74,359,187 bp on chromosome 18.
The gene has a total of 14
exons.
C18orf63 is also known by the alias DKFZP78G0119. No isoforms exist for this gene.
Expression
C18orf63 has high expression in the
testis
.
The gene shows low expression in the kidneys, liver, lung, and pelvis. There is no
phenotype
associated with this gene.
Promoter
The
promoter region
for C18orf63 is 1163 bp long starting at 74,314,813 bp and ending at 74,315,975 bp.
The promoter ID is GXP_4417391. The presence of multiple Y-box binding transcription factor and SRY transcription factor binding sites suggests that C18orf63 is involved in male sex determination.
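The coordinate arithmetic behind these figures can be checked with a short sketch. It assumes that the positions quoted in this article are 1-based and inclusive, which is consistent with the stated promoter length of 1163 bp; the function and variable names are illustrative only.

```python
def inclusive_length(start: int, end: int) -> int:
    """Length in bp of a 1-based, inclusive genomic interval."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


def overlap_length(a: tuple[int, int], b: tuple[int, int]) -> int:
    """Number of positions shared by two inclusive intervals (0 if disjoint)."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return max(0, end - start + 1)


# Coordinates on chromosome 18 as quoted in the article
# (assumed here to be 1-based and inclusive).
PROMOTER = (74_314_813, 74_315_975)
GENE = (74_315_875, 74_359_187)

promoter_bp = inclusive_length(*PROMOTER)   # 1163
gene_span_bp = inclusive_length(*GENE)      # 43313
shared_bp = overlap_length(PROMOTER, GENE)  # 101

print(promoter_bp, gene_span_bp, shared_bp)
```

Under this convention the promoter extends 101 bp into the start of the gene interval, and the gene coordinates cover 43,313 bp of genomic sequence.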
Protein
The C18orf63 protein is composed of 685
amino acid
s and has a molecular weight of 77230.50 Da, with a predicted
isoelectric point
of 9.83.
No
isoforms
exist for this protein.
This protein is rich in
glutamine
,
isoleucine,
lysine, and
serine compared with the average protein, but is poor in
aspartic acid and
glycine
.
Structure
In the predicted secondary structure for this protein there are a number of
beta turn
s,
beta strands and
alpha helices
. About 48.6% of C18orf63 is predicted to form alpha helices and 28.6% to form beta strands.
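As a rough illustration, these percentages can be converted into approximate residue counts using the 685-residue length given in this article. The rounding to whole residues and the "other" category (everything not assigned to a helix or strand) are assumptions of this sketch, not output from a prediction tool.

```python
PROTEIN_LENGTH = 685  # residues, as stated in the article

# Predicted fractions of the sequence in each secondary-structure class.
FRACTIONS = {"alpha_helix": 0.486, "beta_strand": 0.286}


def residues_for(fraction: float, length: int = PROTEIN_LENGTH) -> int:
    """Approximate number of residues covered by a fraction of the chain."""
    return round(fraction * length)


counts = {name: residues_for(f) for name, f in FRACTIONS.items()}
# Whatever is left over (turns, coils and other regions) is grouped as "other".
counts["other"] = PROTEIN_LENGTH - sum(counts.values())

print(counts)
```

This gives roughly 333 residues in helices, 196 in strands, and 156 in other conformations.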
Domains and Motifs
The protein contains one
domain of unknown function, DUF4709, spanning amino acids 7 to 280.
Predicted motifs include an N-terminal motif, an RxxL motif, and a KEN conserving motif, all of which signal for
protein degradation
. Also predicted are a Wxxx motif, which facilitates entry of PTS1 cargo proteins into the organellar lumen, and an RVxPx motif, which allows protein transport from the
trans-Golgi network
to the
plasma membrane of the
cilia. There is also a bipartite
nuclear localization signal
at the end of the protein sequence. There is no transmembrane domain present, indicating that C18orf63 is not a transmembrane protein.
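Short linear motifs written in this notation, where "x" stands for any residue, can be located with simple pattern matching. The sketch below is a minimal illustration: the regular expressions are simplified readings of the motif names used above, and the toy sequence is invented for demonstration and is not the C18orf63 sequence. Dedicated motif-prediction resources use more specific patterns and additional context.

```python
import re

# Simplified patterns: "." matches any single residue (the "x" in the motif name).
MOTIFS = {
    "RxxL": "R..L",
    "KEN": "KEN",
    "RVxPx": "RV.P.",
}


def find_motifs(sequence: str) -> dict[str, list[int]]:
    """Return 1-based start positions of each motif in a one-letter sequence."""
    hits = {}
    for name, pattern in MOTIFS.items():
        # A lookahead is used so that overlapping occurrences are all reported.
        regex = re.compile(f"(?=({pattern}))")
        hits[name] = [m.start() + 1 for m in regex.finditer(sequence)]
    return hits


# Hypothetical toy sequence, made up for this example only.
toy = "MKENAARSTLQRVAPSG"
print(find_motifs(toy))
```

On the toy sequence this reports a KEN match at position 2, an RxxL match at position 7, and an RVxPx match at position 12.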
Post-Translational Modifications
Post-translational modification
s the protein is predicted to undergo include
SUMOylation, PKC and CK2
phosphorylation,
N-glycosylation
, amidation, and cleavage.
There are six PKC phosphorylation sites, two CK2 phosphorylation sites, two SUMOylation sites, and two N-glycosylation sites. There are no signal peptides present in this sequence.
Subcellular Location
Due to the nuclear localization signal at the end of the protein sequence, C18orf63 is predicted to be
nuclear. C18orf63 has also been predicted to be targeted to the
mitochondria in addition to the nucleus.
Homology
Orthologs
Ortholog
s have been found in most
eukaryotes, with the exception of the class ''
Amphibia
''.
No human
paralogs
exist for C18orf63.
The most distant homolog detectable is ''
Mizuhopecten yessoensis
'', which shares 37% identity with the human protein sequence. The domain of unknown function was the only homologous domain present in the protein sequence and was found to be highly conserved in all orthologs. The table below shows some examples of orthologs for this protein.
Rate of Evolution
C18orf63 evolves at a moderately slow rate. The protein evolves faster than
cytochrome c but slower than
beta-globin.
Interacting proteins
Transcription factors of interest predicted to bind to the regulatory sequence include
p53 tumor suppressors,
SRY testis determining factors,
Y-box binding transcription factors, and
glucocorticoid responsive elements.
The JUN protein was found to interact with C18orf63 through anti-bait
co-immunoprecipitation. The JUN protein binds to the USP28 promoter in
colorectal cancer cells and is involved in the activation of these cancer cells.
Clinical significance
Mutations
A variety of
missense mutation
s occur in the human population for this protein. In the regulatory sequence, mutations occur at two transcription factor binding sites.
Transcription factors affected are
glucocorticoid responsive elements and
E2F-myc cell cycle regulators. Eleven common mutations affect the protein sequence itself.
None of these mutations affects the predicted post-translational modification sites of the protein.
Disease association
C18orf63 has been associated with
personality disorder
s,
obesity
, and
type 2 diabetes
through a
genome-wide association study
.
Research has not yet shown whether C18orf63 plays a direct role in any of these diseases.