ORF8 is a gene that encodes a viral accessory protein, Betacoronavirus NS8 protein, in coronaviruses of the subgenus ''Sarbecovirus''. It is one of the least well conserved and most variable parts of the genome.
In some viruses, a deletion splits the region into two smaller open reading frames, called ORF8a and ORF8b, a feature present in many SARS-CoV viral isolates from later in the SARS epidemic, as well as in some bat coronaviruses.
For this reason the full-length gene and its protein are sometimes called ORF8ab.
The full-length gene, exemplified in SARS-CoV-2, encodes a protein with an immunoglobulin domain of unknown function, possibly involving interactions with the host immune system. It is similar in structure to the ORF7a protein, suggesting it may have originated through gene duplication.
Structure
ORF8 in SARS-CoV-2 encodes a protein of 121 amino acid residues with an N-terminal signal sequence. ORF8 forms a dimer that is covalently linked by disulfide bonds. It has an immunoglobulin-like domain with distant similarity to the ORF7a protein. Despite a similar overall fold, an insertion in ORF8 is likely responsible for different protein-protein interactions and creates an additional dimerization interface. Unlike ORF7a, ORF8 lacks a transmembrane helix and is therefore not a transmembrane protein, though it has been suggested that it might have a membrane-anchored form.
The ORF8 proteins of SARS-CoV and SARS-CoV-2 are highly divergent, with less than 20% sequence identity.
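As an illustration of what a sequence identity figure means, the following is a minimal sketch that computes percent identity from two sequences that have already been aligned. The function name `percent_identity`, the gap character, and the toy sequences are assumptions made for this example; they are not the real ORF8 sequences, and published identity values may use a different denominator (for example the length of the shorter sequence).

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns, skipping columns where both are gaps.

    Assumes the inputs are already aligned (equal length, gaps inserted).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # column carries no information
        compared += 1
        if a == b:
            matches += 1  # identical residues (a gap cannot match here)

    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Hypothetical toy alignment, not real viral protein sequences.
toy_a = "MKF-LVAC"
toy_b = "MRFALV-C"
print(percent_identity(toy_a, toy_b))  # 5 matches over 8 columns
```

Columns in which only one sequence has a gap count as mismatches in this sketch, which is one of several conventions in use.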
The full-length ORF8 in SARS-CoV encodes a protein of 122 residues. In many SARS-CoV isolates it is split into ORF8a and ORF8b, separately expressing 39-residue ORF8a and 84-residue ORF8b proteins.
It has been suggested that the ORF8a and ORF8b proteins may form a protein complex. The cysteine residue responsible for dimerization of the SARS-CoV-2 protein is not conserved in the SARS-CoV sequence. The ORF8ab protein has also been reported to form disulfide-linked multimers.
Post-translational modifications
The full-length SARS-CoV ORF8ab protein is post-translationally modified by N-glycosylation, which is predicted to be conserved in the SARS-CoV-2 protein. Under experimental conditions, both 8b and 8ab are ubiquitinated.
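Predictions of N-glycosylation commonly start from the consensus sequon N-X-S/T, where X is any residue except proline. The sketch below scans a protein sequence for that motif. The function name `find_sequons` and the example sequence are hypothetical, the sequence is not ORF8, and the presence of a sequon is a necessary condition rather than proof that a site is glycosylated.

```python
import re

# Lookahead so that overlapping candidate sites are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein: str) -> list[int]:
    """Return 1-based positions of asparagines in N-X-S/T motifs (X != P)."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


# Hypothetical sequence: N3-V-S and N11-G-T qualify, N7-P-T does not.
example = "MANVSQNPTLNGTK"
print(find_sequons(example))
```

Comparing the positions reported for two aligned homologs is one simple way to ask whether a candidate site is conserved.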
Expression and localization
Along with the genes for other accessory proteins, the ORF8 gene is located near those encoding the structural proteins, at the 3' end of the coronavirus RNA genome. Along with ORF6, ORF7a, and ORF7b, ORF8 is located between the membrane (M) and nucleocapsid (N) genes.
The SARS-CoV-2 ORF8 protein has a signal sequence for trafficking to the endoplasmic reticulum (ER) and has been experimentally localized to the ER. It is probably a secreted protein.
There are variable reports in the literature regarding the localization of SARS-CoV ORF8a, ORF8b, or ORF8ab proteins.
It is unclear if ORF8b is expressed at significant levels under natural conditions.
The full-length ORF8ab appears to localize to the ER.
Function
The function of the ORF8 protein is unknown. It is not essential for viral replication in either SARS-CoV or SARS-CoV-2, though there is conflicting evidence on whether loss of ORF8 affects the efficiency of viral replication.
A function often suggested for the ORF8 protein is interaction with the host immune system. The SARS-CoV-2 protein is thought to have a role in immunomodulation via immune evasion or suppression of host immune responses. It has been reported to be a type I interferon antagonist and to downregulate class I MHC. The SARS-CoV-2 ORF8 protein is highly immunogenic, and high levels of antibodies to the protein have been found in patients with or recovered from COVID-19.
A recent study indicates that ORF8 is a transcription inhibitor (Lisa Thomann, Volker Thiel. SARS-CoV-2 mimics a host protein to bypass defences. Nature 2022 Oct;610(7931):262-263. PMID 36198813).
It has been suggested that the SARS-CoV ORF8a protein assembles into multimers and forms a viroporin.
Evolution
The evolutionary history of ORF8 is complex. It is among the least conserved regions of the ''Sarbecovirus'' genome. It is subject to frequent mutations and deletions, and has been described as "hypervariable" and a recombination hotspot. It has been suggested that RNA secondary structures in the region are associated with genomic instability.
In SARS-CoV, the ORF8 region is thought to have originated through recombination among ancestral bat coronaviruses. Among the most distinctive features of this region in SARS-CoV is the emergence of a 29-nucleotide deletion that split the full-length open reading frame into two smaller ORFs, ORF8a and ORF8b. Viral isolates from early in the SARS epidemic have a full-length, intact ORF8, but the split structure emerged later in the epidemic. Similar split structures have since been observed in bat coronaviruses. Mutations and deletions have also been seen in SARS-CoV-2 variants. Based on observations in SARS-CoV, it has been suggested that changes in ORF8 may be related to host adaptation, but it is possible that ORF8 does not affect fitness in human hosts. In SARS-CoV, a high dN/dS ratio has been observed in ORF8, consistent with positive selection or with relaxed selection.
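A deletion whose length is not a multiple of three shifts the reading frame downstream of it, which is one way a single open reading frame can be split into two. Since 29 leaves a remainder of 2 when divided by 3, the sketch below uses a 2-nucleotide deletion in a made-up 39-nucleotide sequence to show the same kind of frame shift. The sequence `TOY_ORF` and the helper `read_orf` are hypothetical and chosen only for illustration; this is not the actual SARS-CoV sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def read_orf(seq: str, start: int):
    """Read codons from `start` until the first in-frame stop codon.

    Returns (number_of_codons_before_stop, index_of_stop), or None if
    no in-frame stop codon is found.
    """
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return (i - start) // 3, i
    return None


# Hypothetical toy ORF: 12 codons followed by a stop codon.
TOY_ORF = "ATGGCTGAACTTAAGCGCTTATGCAACCGTGATAAGTAA"

# Intact sequence: one long reading frame.
full = read_orf(TOY_ORF, 0)

# Delete 2 nucleotides (same remainder mod 3 as a 29-nt deletion).
split_seq = TOY_ORF[:6] + TOY_ORF[8:]
first = read_orf(split_seq, 0)                    # truncated upstream ORF
next_start = split_seq.find("ATG", first[1] + 3)  # next start codon after the stop
second = read_orf(split_seq, next_start)          # new downstream ORF

# Delete 3 nucleotides instead: the frame is preserved, one codon is lost.
in_frame = read_orf(TOY_ORF[:6] + TOY_ORF[9:], 0)

print(full, first, next_start, second, in_frame)
```

In this toy case the frame-shifting deletion yields two short ORFs where there was one, whereas the in-frame deletion only shortens the original ORF by a codon.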
ORF8 encodes a protein whose immunoglobulin (Ig) domain has distant similarity to that of ORF7a. It has been suggested that ORF8 may have evolved from ORF7a through gene duplication, though some bioinformatics analyses suggest the similarity may be too low to support duplication, which is relatively uncommon in viruses. Immunoglobulin domains are uncommon in coronaviruses; other than the subset of betacoronaviruses with ORF8 and ORF7a, only a small number of bat alphacoronaviruses have been identified as containing likely Ig domains, while they are absent from gammacoronaviruses and deltacoronaviruses. ORF8 is notably absent in MERS-CoV. The beta and alpha Ig domains may be independent acquisitions, where ORF8 and ORF7a may have been acquired from host proteins. It is also possible that the absence of ORF8 reflects gene loss in those lineages.