C16orf71

Uncharacterized protein Chromosome 16 Open Reading Frame 71 is a protein in humans, encoded by the C16orf71 gene. The gene is expressed in epithelial tissue of the respiratory system, adipose tissue, and the testes. Predicted associated biological processes of the gene include regulation of the cell cycle, cell proliferation, apoptosis, and cell differentiation in those tissue types. A total of 1357 bp of the gene are antisense to the spliced genes ZNF500 and ANKS3, indicating the possibility of regulated alternate expression.


Gene


Locus

The gene is located on the short arm of chromosome 16 at 16p13.1. Its genomic sequence begins on the plus strand at 4,734,242 bp and ends at 4,749,396 bp.
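
As a rough sanity check, the length of the locus can be derived from the two coordinates quoted above. The sketch below assumes the coordinates are 1-based and inclusive, which the source does not state; the function and constant names are illustrative only.

```python
def span_length_bp(start: int, end: int) -> int:
    """Length in base pairs of an inclusive, 1-based genomic interval."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Coordinates quoted in the text (plus strand, chromosome 16).
# Assumption: both positions are 1-based and inclusive.
GENE_START = 4_734_242
GENE_END = 4_749_396

gene_length_bp = span_length_bp(GENE_START, GENE_END)
gene_length_kb = gene_length_bp / 1000
print(f"C16orf71 spans {gene_length_bp} bp")
```

Under that assumption the span comes to roughly 15.2 kb, close to the 15.14 kb reported for the gene; small differences of this kind typically depend on the annotation release and the coordinate convention used.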


mRNA


Alternative splicing

Three different protein-encoding transcript variants, or isoforms, have been identified for C16orf71. One non-protein-coding transcript variant has also been identified for the gene.


Protein


General properties

The primary encoded protein consists of 520 amino acid residues and has a molecular weight of approximately 55.68 kDa. The gene that encodes it contains 11 exons and is 15.14 kb long. The predicted isoelectric point was reported to be 4.81, which indicates an acidic protein that carries a net negative charge at physiological pH; the protein was also reported to be relatively unstable. The gene was reported to be well expressed, at 1.1 times the average gene level.
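
The reported size and molecular weight can be cross-checked against each other by computing the mean mass per residue. This is only a back-of-the-envelope sketch: the function name is illustrative, and the comparison value of roughly 110 Da per residue is a commonly used rule of thumb rather than a figure from the source.

```python
def mean_residue_mass_da(molecular_weight_kda: float, n_residues: int) -> float:
    """Average mass per residue in daltons for a protein of known size."""
    if n_residues <= 0:
        raise ValueError("n_residues must be positive")
    return molecular_weight_kda * 1000.0 / n_residues


# Values quoted in the text for the primary C16orf71 protein.
MOLECULAR_WEIGHT_KDA = 55.68
N_RESIDUES = 520

mean_mass = mean_residue_mass_da(MOLECULAR_WEIGHT_KDA, N_RESIDUES)
print(f"Mean residue mass: {mean_mass:.1f} Da")
```

A result slightly below the rule-of-thumb value would be consistent with a protein enriched in small residues such as alanine and serine.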


Composition

Alanine was the most abundant amino acid, contributing 11.54% of the molecular weight of the protein. Serine was the second most abundant, contributing 10.19% of the overall molecular weight. The average alanine frequency in vertebrate proteins is approximately 7.4% and the average serine frequency is approximately 8.1%.
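
A composition comparison of this kind can be sketched as follows. The peptide used here is a hypothetical toy sequence, not the C16orf71 sequence, and the function name is illustrative. Note also that the text quotes shares of molecular weight for C16orf71 but residue frequencies for the vertebrate averages, so the ratios computed at the end are only a rough indication of enrichment.

```python
from collections import Counter


def residue_frequencies(sequence: str) -> dict:
    """Percent frequency of each residue in a one-letter protein sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in counts.items()}


# Hypothetical toy peptide for illustration; NOT the C16orf71 sequence.
toy_peptide = "MASASAAKSL"
toy_freqs = residue_frequencies(toy_peptide)

# Figures quoted in the text, in percent:
# (value reported for C16orf71, vertebrate average frequency).
quoted = {"Ala": (11.54, 7.4), "Ser": (10.19, 8.1)}
enrichment = {aa: observed / average for aa, (observed, average) in quoted.items()}

for aa, ratio in enrichment.items():
    print(f"{aa}: {ratio:.2f}x the vertebrate average")
```

With the quoted figures, both ratios come out above 1, in line with the statement that alanine and serine are overrepresented relative to typical vertebrate proteins.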


Domains

C16orf71 has one identified domain of unknown function, DUF4701, that is conserved in all mammals and some species of reptiles and birds. DUF4701 spans from amino acid residue 21 to 520 in the protein.
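
To put the domain boundaries in context, the fraction of the protein covered by DUF4701 can be computed from the residue numbers given above. The sketch assumes inclusive residue numbering and a full-length protein of 520 residues, as stated elsewhere in the article; the names are illustrative.

```python
def domain_coverage(start: int, end: int, protein_length: int) -> tuple:
    """Return (domain length, percent of the protein covered), inclusive numbering."""
    if not 1 <= start <= end <= protein_length:
        raise ValueError("domain must lie within the protein")
    length = end - start + 1
    return length, 100.0 * length / protein_length


# DUF4701 boundaries and protein length as quoted in the text.
duf_length, duf_percent = domain_coverage(start=21, end=520, protein_length=520)
print(f"DUF4701: {duf_length} residues, {duf_percent:.1f}% of the protein")
```

On these numbers the domain accounts for nearly the whole protein, leaving only the first 20 residues outside it.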


Post-translational modifications

C16orf71 is predicted to undergo multiple post-translational modifications such as phosphorylation, N-glycosylation, and amidation.


Protein interactions


Experimentally proven interactions

Experimentation with C16orf71 has revealed interactions with four other proteins: ARHGAP1, ZNFX1, PLVAP, and MBTPS1. ARHGAP1, ZNFX1, and MBTPS1 are associated with regulation in signaling and metabolism, while PLVAP is associated with the formation of small lipid rafts in the plasma membrane of vertebrate endothelial and adipose cells.


Predicted interactions

The majority of the predicted interactions involving the protein were related to the regulation of mitotic processes, cellular differentiation, proliferation, metabolism, and signaling. Additional related processes included the formation and differentiation of B cells, T cells, endothelial cells, endoderm, and endocrine glands.


Subcellular localization

C16orf71 was observed in nuclear speckles of the nucleus through experimental protocols involving fluorescence in situ hybridization with antibodies. Nuclear speckles, also known as interchromatin granule clusters, are enriched in pre-mRNA splicing factors. These highly dynamic structures are located in interchromatin regions of the nucleoplasm in mammalian cells and have been observed to cycle throughout various nuclear regions and active transcription sites.


Structure

The secondary structure of C16orf71 is predicted to consist primarily of coils, with small regions of alpha helices and two segments of beta sheets across the span of the protein. Analysis of the protein sequences of the gene's mammalian orthologs gave similar predictions, while the more distant reptilian and avian ortholog sequences were predicted to contain more beta-sheet regions.


Expression


Tissue expression pattern

Human expression of the gene has been observed primarily in respiratory epithelial tissue, specifically the trachea, larynx, nasopharynx, and bronchus. C16orf71 is also moderately expressed in adipose tissue and testes.


DNA microarray experimental data

DNA microarray analysis from various experiments provided information on the expression levels of C16orf71 under a range of conditions. The gene appears to have higher levels of expression in the omental adipose tissue of obese subjects compared to non-obese subjects. C16orf71 was also observed to have decreased expression when there was a depletion of HIF-1 alpha, HIF-2 beta, or both. Hypoxia-inducible factors (HIFs) are responsible for mediating the effects of hypoxia within the body. In addition, HIFs promote clotting and restoration of various epithelial tissues and are vital in the development of mammalian embryos, sperm, and ova. Data from one experiment also indicated noticeably lower expression of the gene in sperm affected by teratozoospermia, a condition in which sperm have abnormal morphology that affects male fertility, compared to normal sperm. C16orf71 was observed to be present in all stages of development, with similar levels of expression throughout.


Toxicogenomics experimental data

Three chemicals, bisphenol A, butyraldehyde, and polychlorinated biphenyls, have been experimentally tested with C16orf71 for evidence of interaction. Bisphenol A is suspected to cause impairment in male reproduction. An experiment utilizing seminiferous tubule culture was conducted to observe the effects on meiosis and potential germ-line abnormalities. Gene expression analysis revealed decreased expression of C16orf71 after exposure to the chemical. Butyraldehyde has been observed to affect inflammatory responses in bronchial airway tissue on a genetic level. Microarray analysis was used to determine levels of expression in human alveolar epithelial cells after exposure to the compound. Results indicated decreased expression of C16orf71 after exposure to the chemical. Polychlorinated biphenyl was used in an experiment to determine its effects on external male genital development. Human fetal corpora cavernosa cells were used as the model tissue. Toxicogenomic analysis indicated that the chemical affected all genes involved with genitourinary development and revealed lowered expression levels for C16orf71.


Regulation of expression

A total of 1357 bp of the gene are antisense to the spliced genes ZNF500 and ANKS3, indicating the possibility of regulated alternate expression. A ZNF500 transcription factor binding domain was found on the minus strand within the promoter region of the gene. ZNF500 is predicted to play a role in gene regulation, transcription, and cellular differentiation. The promoter region was predicted to begin 117 bp upstream of the 5' UTR of C16orf71 and to be 1371 bp long. The region was analyzed for predicted transcription factors and regulatory elements. Predicted transcription factors in the promoter region were related to the regulation of the cell cycle, proliferation, apoptosis, and differentiation of sperm and epithelial tissue components.


Predicted transcription factors


Homology


Paralogs

No human paralogs for the gene were found.


Orthologs

Orthologs have been identified in most mammals for which complete genome data are available. C16orf71 and its domain of unknown function, DUF4701, were present in mammals. The most distant orthologs identified were reptilian.


Molecular evolution

The ''m'' value, or number of corrected amino acid changes per 100 residues, for the gene C16orf71 was plotted against the divergence of species in millions of years. When compared to the data for hemoglobin, fibrinopeptides, and cytochrome C, the gene was found to follow the progression of fibrinopeptides most closely, suggesting a relatively rapid pace of evolution. The ''m'' values for C16orf71 were derived from the percentage identity of each species' mRNA sequence compared to the human sequence, using the formula derived from the molecular clock hypothesis.
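
The source does not spell out the formula it used. A minimal sketch, assuming the commonly used Poisson-type correction m = -100 ln(1 - n/100), where n is the observed percentage of differing positions (100 minus percent identity), is shown below. The function name and the example identity value are illustrative and are not taken from the C16orf71 data.

```python
import math


def corrected_changes_per_100(percent_identity: float) -> float:
    """Corrected changes per 100 residues (m) from percent identity.

    Assumes m = -100 * ln(1 - n/100) with n = 100 - percent_identity;
    this correction is an assumption, not a formula given in the source.
    """
    if not 0.0 < percent_identity <= 100.0:
        raise ValueError("percent_identity must be in (0, 100]")
    n = 100.0 - percent_identity  # observed differences per 100 residues
    return abs(-100.0 * math.log(1.0 - n / 100.0))  # abs() avoids returning -0.0


# Illustrative value only: an ortholog that is 80% identical to human.
m_example = corrected_changes_per_100(80.0)
print(f"m = {m_example:.2f} corrected changes per 100 residues")
```

Because the correction accounts for repeated changes at the same position, m is always at least as large as the raw percentage of differences, and the gap widens as sequences diverge. Plotting m against divergence time for several orthologs gives the kind of curve that the article compares with hemoglobin, fibrinopeptides, and cytochrome C.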

