ORF8a

ORF8 is a gene that encodes a viral accessory protein, Betacoronavirus NS8 protein, in coronaviruses of the subgenus ''Sarbecovirus''. It is one of the least well conserved and most variable parts of the genome. In some viruses, a deletion splits the region into two smaller open reading frames, called ORF8a and ORF8b, a feature present in many SARS-CoV viral isolates from later in the SARS epidemic, as well as in some bat coronaviruses. For this reason the full-length gene and its protein are sometimes called ORF8ab. The full-length gene, exemplified in SARS-CoV-2, encodes a protein with an immunoglobulin domain of unknown function, possibly involving interactions with the host immune system. It is similar in structure to the ORF7a protein, suggesting it may have originated through gene duplication.


Structure

ORF8 in SARS-CoV-2 encodes a protein of 121 amino acid residues with an N-terminal signal sequence. ORF8 forms a dimer that is covalently linked by disulfide bonds. It has an immunoglobulin-like domain with distant similarity to the ORF7a protein. Despite a similar overall fold, an insertion in ORF8 is likely responsible for different protein-protein interactions and creates an additional dimerization interface. Unlike ORF7a, ORF8 lacks a transmembrane helix and is therefore not a transmembrane protein, though it has been suggested that it might have a membrane-anchored form.

The ORF8 proteins of SARS-CoV and SARS-CoV-2 are very divergent, with less than 20% sequence identity. The full-length ORF8 in SARS-CoV encodes a protein of 122 residues. In many SARS-CoV isolates it is split into ORF8a and ORF8b, which separately express a 39-residue ORF8a protein and an 84-residue ORF8b protein. It has been suggested that the ORF8a and ORF8b proteins may form a protein complex. The cysteine residue responsible for dimerization of the SARS-CoV-2 protein is not conserved in the SARS-CoV sequence. The ORF8ab protein has also been reported to form disulfide-linked multimers.
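The "less than 20% sequence identity" figure above is the kind of number obtained by counting identical positions in a pairwise alignment. The sketch below illustrates that calculation only. The two aligned strings are made-up toy peptides, not the real ORF8 sequences, and the function name `percent_identity` and the choice of denominator (all alignment columns except those where both sequences have a gap) are assumptions made for illustration; published identity values can use other conventions.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns in which both sequences carry the same residue.

    Columns where both sequences have a gap are ignored; a column with a gap
    in only one sequence counts as a mismatch (one convention among several).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    columns = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # shared gap column carries no information
        columns += 1
        if a == b:
            matches += 1

    if columns == 0:
        raise ValueError("alignment has no informative columns")
    return 100.0 * matches / columns


# Hypothetical, pre-aligned toy peptides (NOT real ORF8 sequences).
toy_a = "MKFLV-FLGI"
toy_b = "MKLLIVLT-I"

print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```

With real data the two protein sequences would first have to be aligned by a dedicated alignment tool; the function only scores an alignment that already exists.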


Post-translational modifications

The full-length SARS-CoV ORF8ab protein is post-translationally modified by N-glycosylation, which is predicted to be conserved in the SARS-CoV-2 protein. Under experimental conditions, both 8b and 8ab are ubiquitinated.


Expression and localization

Along with the genes for other accessory proteins, the ORF8 gene is located near those encoding the structural proteins, at the 3' end of the coronavirus RNA genome. Along with ORF6, ORF7a, and ORF7b, ORF8 is located between the membrane (M) and nucleocapsid (N) genes. The SARS-CoV-2 ORF8 protein has a signal sequence for trafficking to the endoplasmic reticulum (ER) and has been experimentally localized to the ER. It is probably a secreted protein. Reports in the literature vary regarding the localization of the SARS-CoV ORF8a, ORF8b, and ORF8ab proteins. It is unclear whether ORF8b is expressed at significant levels under natural conditions. The full-length ORF8ab appears to localize to the ER.


Function

The function of the ORF8 protein is unknown. It is not essential for viral replication in either SARS-CoV or SARS-CoV-2, though there is conflicting evidence on whether loss of ORF8 affects the efficiency of viral replication. A function often suggested for the ORF8 protein is interacting with the host immune system. The SARS-CoV-2 protein is thought to have a role in immunomodulation via immune evasion or suppressing host immune responses. It has been reported to be a type I interferon antagonist and to downregulate class I MHC. The SARS-CoV-2 ORF8 protein is highly immunogenic, and high levels of antibodies to the protein have been found in patients with or recovered from COVID-19. A recent study indicates that ORF8 is a transcription inhibitor (Lisa Thomann, Volker Thiel. SARS-CoV-2 mimics a host protein to bypass defences. Nature 2022 Oct;610(7931):262-263. PMID 36198813). It has been suggested that the SARS-CoV ORF8a protein assembles into multimers and forms a viroporin.


Evolution

The evolutionary history of ORF8 is complex. It is among the least conserved regions of the ''Sarbecovirus'' genome. It is subject to frequent mutations and deletions, and has been described as "hypervariable" and a recombination hotspot. It has been suggested that RNA secondary structures in the region are associated with genomic instability. In SARS-CoV, the ORF8 region is thought to have originated through recombination among ancestral bat coronaviruses. Among the most distinctive features of this region in SARS-CoV is the emergence of a 29-nucleotide deletion that split the full-length open reading frame into two smaller ORFs, ORF8a and ORF8b. Viral isolates from early in the SARS epidemic have a full-length, intact ORF8, but the split structure emerged later in the epidemic. Similar split structures have since been observed in bat coronaviruses. Mutations and deletions have also been seen in SARS-CoV-2 variants. Based on observations in SARS-CoV, it has been suggested that changes in ORF8 may be related to host adaptation, but it is possible that ORF8 does not affect fitness in human hosts. In SARS-CoV, a high dN/dS ratio has been observed in ORF8, consistent with positive selection or with relaxed selection.

ORF8 encodes a protein whose immunoglobulin (Ig) domain has distant similarity to that of ORF7a. It has been suggested that ORF8 may have evolved from ORF7a through gene duplication, though some bioinformatics analyses suggest the similarity may be too low to support duplication, which is relatively uncommon in viruses. Immunoglobulin domains are uncommon in coronaviruses; other than the subset of betacoronaviruses with ORF8 and ORF7a, only a small number of bat alphacoronaviruses have been identified as containing likely Ig domains, while they are absent from gammacoronaviruses and deltacoronaviruses. ORF8 is notably absent in MERS-CoV. The beta and alpha Ig domains may be independent acquisitions, where ORF8 and ORF7a may have been acquired from host proteins. It is also possible that the absence of ORF8 reflects gene loss in those lineages.
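The way a 29-nucleotide deletion can split one open reading frame into two follows from 29 not being a multiple of three: everything downstream of the deletion is read in a shifted frame, so a stop codon can appear early and a later start codon can open a second, shorter ORF. The sketch below shows that mechanism on an invented toy sequence. The sequence, the deletion position, and the helper names `find_orfs` and `delete` are all hypothetical and are not taken from any SARS-CoV genome; only the three forward reading frames are scanned.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str) -> list[tuple[int, int]]:
    """Return sorted (start, end) index pairs of ATG...stop ORFs in the three forward frames.

    `end` is exclusive and includes the stop codon. Within a frame, an ORF
    opens at the first ATG seen and closes at the next in-frame stop codon.
    """
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                orfs.append((start, i + 3))
                start = None
    return sorted(orfs)


def delete(seq: str, start: int, length: int) -> str:
    """Remove `length` nucleotides beginning at index `start`."""
    return seq[:start] + seq[start + length:]


# Hypothetical toy gene: a single ORF with an internal, in-frame ATG.
toy_full = "ATG" + "GCT" * 20 + "GATAAG" + "GCT" * 2 + "ATG" + "GCT" * 10 + "TAA"

# A 29-nt deletion shifts the downstream reading frame.
toy_split = delete(toy_full, 10, 29)

for label, seq in (("full-length", toy_full), ("after 29-nt deletion", toy_split)):
    print(label, find_orfs(seq))
```

In this toy case the intact sequence yields one ORF, while the deleted version yields two shorter ones: the first ends at a stop codon exposed by the frameshift, and the second begins at the downstream ATG. Deleting 30 nucleotides at the same position instead keeps the frame and leaves a single, shorter ORF.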

