ORF7a (also known by several other names, including SARS coronavirus X4, SARS-X4, or U122) is a gene found in coronaviruses of the ''Betacoronavirus'' genus. It expresses the Betacoronavirus NS7A protein, a type I transmembrane protein with an immunoglobulin-like protein domain. It was first discovered in SARS-CoV, the virus that causes severe acute respiratory syndrome (SARS). The homolog in SARS-CoV-2, the virus that causes COVID-19, has about 85% sequence identity to the SARS-CoV protein.
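As an illustration of what a sequence identity figure measures, the following minimal sketch computes percent identity between two pre-aligned protein sequences. The function name `percent_identity`, the gap character, and the toy sequences are assumptions made for this example; they are not real ORF7a sequences. The denominator used here (columns where neither sequence has a gap) is one of several conventions in use, so published figures may be computed differently.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0  # columns without a gap in either sequence
    matches = 0   # compared columns with identical residues
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy, made-up aligned fragments (hypothetical, NOT real ORF7a sequences).
toy_a = "ACDEFGHIK-LMNP"
toy_b = "ACDEFGHVKWLMNP"
identity = percent_identity(toy_a, toy_b)
print(f"{identity:.1f}% identity")
```

The sketch assumes the alignment has already been produced by a separate tool; it only counts matching columns.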
Function
A number of possible functions for the ORF7a protein have been described. The primary function is thought to be immunomodulation and interferon antagonism. The protein is not essential for viral replication.
Viral protein interactions
Studies in SARS-CoV suggest that the protein forms protein-protein interactions with spike protein and ORF3a, and is present in mature virions, making it a minor viral structural protein. It is unclear if this occurs in SARS-CoV-2. It may have a role in viral assembly.
Host effects
A number of interactions with host proteins and effects on host cell processes have been described. The SARS-CoV ORF7a protein has been reported to have binding activity to integrin I domains. It has also been reported to induce apoptosis via a caspase-dependent pathway. It also contains a motif that has been demonstrated to mediate COPII-dependent transport out of the endoplasmic reticulum, and the protein is targeted to the Golgi apparatus.
In SARS-CoV-2, ORF7a protein has been described as an effective interferon antagonist. The SARS-CoV-2 protein may have immunomodulatory effects through interaction with monocytes.
Structure
The ORF7a protein is a transmembrane protein with 121 amino acid residues in SARS-CoV-2 and 122 in SARS-CoV.
It is a type I transmembrane protein with an N-terminal signal peptide, an ectodomain that has an immunoglobulin fold, and a C-terminal endoplasmic reticulum retention signal sequence.
The structure contains seven beta strands which form two beta sheets, arranged in a beta sandwich.
Most of the sequence differences between SARS-CoV and SARS-CoV-2 occur in the Ig-like ectodomain and may produce differences in protein-protein interactions.
Post-translational modifications
The SARS-CoV-2 ORF7a protein has been reported to be post-translationally modified by ubiquitination. Polyubiquitin chains attached to lysine 119 may be related to the protein's reported interferon antagonism.
Expression and localization
Along with the genes for other viral accessory proteins, the ORF7a gene is located near those encoding the viral structural proteins, at the 3' end of the coronavirus RNA genome.
ORF7a is an overlapping gene that overlaps ORF7b.
In SARS-CoV, subcellular localization to the endoplasmic reticulum, Golgi apparatus, and ERGIC has been reported, with similar Golgi localization described for SARS-CoV-2.
Evolution

It is thought that ''ORF8'' in SARS-CoV-2, which encodes a protein with a similar Ig-like fold, may be a paralog of ORF7a that originated through gene duplication, though some bioinformatics analyses suggest the similarity may be too low to support duplication, which is relatively uncommon in viruses.
Immunoglobulin domains are uncommon in coronaviruses; other than the subset of betacoronaviruses with ORF8 and ORF7a, only a small number of bat alphacoronaviruses have been identified as containing likely Ig domains, while they are absent from gammacoronaviruses and deltacoronaviruses.
The Ig domains in betacoronaviruses and alphacoronaviruses may be independent acquisitions, and ORF8 and ORF7a may have been acquired from host proteins.
Many SARS-CoV-2 genomes have been sequenced throughout the COVID-19 pandemic and a number of variations in ORF7a have been reported, including deletion mutations, nonsense mutations (introducing a premature stop codon and truncating the protein), and at least one gene fusion.
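The effect of a nonsense mutation can be sketched in a few lines: a single nucleotide change creates an in-frame stop codon, so fewer codons are translated. The helper `coding_length` and both toy sequences are hypothetical and chosen only for illustration; they are not taken from any ORF7a variant. The sketch assumes DNA-alphabet input that starts in frame.

```python
# Stop codons of the standard genetic code, written in the DNA alphabet.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def coding_length(cds: str) -> int:
    """Number of codons read before the first in-frame stop codon."""
    cds = cds.upper()
    count = 0
    # Step through complete codons only; a trailing partial codon is ignored.
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            break
        count += 1
    return count


# Toy coding sequences (hypothetical, NOT real ORF7a sequences).
reference = "ATGAAACAGGGCTAA"  # ATG AAA CAG GGC TAA
mutant = "ATGAAATAGGGCTAA"     # C->T turns CAG (Gln) into TAG (stop)

print(coding_length(reference), coding_length(mutant))
```

Comparing the two lengths shows how much of the protein is lost downstream of the premature stop codon.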