Protein Production (biotechnology)

Protein production is the biotechnological process of generating a specific protein. It is typically achieved by manipulating gene expression in an organism so that it expresses large amounts of a recombinant gene. This includes the transcription of the recombinant DNA to messenger RNA (mRNA) and the translation of mRNA into polypeptide chains, which are ultimately folded into functional proteins and may be targeted to specific subcellular or extracellular locations. Protein production systems (also known as expression systems) are used in the life sciences, biotechnology, and medicine. Molecular biology research uses numerous proteins and enzymes, many of which come from expression systems, particularly DNA polymerase for PCR, reverse transcriptase for RNA analysis, and restriction endonucleases for cloning. Expression systems are also used to make proteins that are screened in drug discovery as biological targets or as potential drugs themselves. There are also significant applications for expression systems in industrial fermentation, notably the production of biopharmaceuticals such as human insulin to treat diabetes, and the manufacture of enzymes.
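
The two steps named above, transcription of DNA to mRNA and translation of mRNA into a polypeptide, can be illustrated with a minimal Python sketch. It assumes the standard genetic code and a coding-strand input, and it ignores folding, targeting, and regulation entirely. The function names `transcribe` and `translate` are illustrative only and do not come from any particular library.

```python
from itertools import product

# Standard genetic code, laid out in the conventional T, C, A, G base order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand: str) -> str:
    """Return the mRNA for a DNA coding strand (T is replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate from the first AUG up to the first stop codon.

    Returns one-letter amino acid codes, or an empty string when no
    start codon is present.
    """
    start = mrna.find("AUG")
    if start == -1:
        return ""
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].replace("U", "T")
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":  # stop codon ends the chain
            break
        peptide.append(amino_acid)
    return "".join(peptide)
```

For example, `translate(transcribe("ATGGCCTAA"))` gives `"MA"` (methionine, alanine, then a stop codon).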


Protein production systems

Commonly used protein production systems include those derived from bacteria, yeast, baculovirus/insect cells, mammalian cells, and more recently filamentous fungi such as ''Myceliophthora thermophila''. When biopharmaceuticals are produced with one of these systems, process-related impurities termed host cell proteins also end up in the final product in trace amounts.


Cell-based systems

The oldest and most widely used expression systems are cell-based and may be defined as the "''combination of an expression vector, its cloned DNA, and the host for the vector that provide a context to allow foreign gene function in a host cell, that is, produce proteins at a high level''". Overexpression is an abnormally and excessively high level of gene expression which produces a pronounced gene-related phenotype. There are many ways to introduce foreign DNA into a cell for expression, and many different host cells may be used; each expression system has distinct advantages and liabilities. Expression systems are normally referred to by the host and the DNA source or the delivery mechanism for the genetic material. For example, common hosts are bacteria (such as ''E. coli'' and ''B. subtilis''), yeast (such as ''S. cerevisiae'') or eukaryotic cell lines. Common DNA sources and delivery mechanisms are viruses (such as baculovirus, retrovirus, adenovirus), plasmids, artificial chromosomes and bacteriophage (such as lambda).

The best expression system depends on the gene involved; for example, ''Saccharomyces cerevisiae'' is often preferred for proteins that require significant post-translational modification. Insect or mammalian cell lines are used when human-like splicing of mRNA is required. Nonetheless, bacterial expression has the advantage of easily producing large amounts of protein, which is required for X-ray crystallography or nuclear magnetic resonance experiments for structure determination.

Because bacteria are prokaryotes, they are not equipped with the full enzymatic machinery to accomplish the required post-translational modifications or molecular folding. Hence, multi-domain eukaryotic proteins expressed in bacteria are often non-functional. Also, many proteins become insoluble as inclusion bodies that are difficult to recover without harsh denaturants and subsequent cumbersome protein refolding. To address these concerns, expression systems using various eukaryotic cells were developed for applications that require proteins to be conformed as in, or closer to, eukaryotic organisms: cells of plants (e.g. tobacco), insects or mammals (e.g. bovines) are transfected with genes and cultured in suspension, and even as tissues or whole organisms, to produce fully folded proteins. Mammalian ''in vivo'' expression systems, however, have low yield and other limitations (for example, they are time-consuming and can be toxic to host cells). To combine the high yield, productivity and scalability of bacteria and yeast with the advanced epigenetic features of plant, insect and mammalian systems, other protein production systems are being developed using unicellular eukaryotes (e.g. non-pathogenic ''Leishmania'' cells).
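
The rules of thumb in this section can be written down as a small decision helper. The Python sketch below is a deliberate simplification that encodes only the preferences stated above (human-like splicing, post-translational modification, otherwise bacterial expression for high yield). The `Requirements` class and `suggest_host` function are hypothetical names used for illustration, and real host selection involves many more factors.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    """Hypothetical summary of what the target protein needs."""
    needs_humanlike_splicing: bool = False
    needs_significant_ptm: bool = False  # post-translational modification


def suggest_host(req: Requirements) -> str:
    """Return a host class following the heuristics stated in the text."""
    if req.needs_humanlike_splicing:
        return "insect or mammalian cell line"
    if req.needs_significant_ptm:
        return "yeast (e.g. S. cerevisiae)"
    # Default: bacteria easily produce large amounts of protein,
    # e.g. for X-ray crystallography or NMR.
    return "bacteria (e.g. E. coli)"
```

The checks are ordered from the most restrictive requirement to the least, so a protein that needs both human-like splicing and post-translational modification is directed to insect or mammalian cells.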


Bacterial systems


''Escherichia coli''

''E. coli'' is one of the most widely used expression hosts, and DNA is normally introduced in a plasmid expression vector. The techniques for overexpression in ''E. coli'' are well developed and work by increasing the number of copies of the gene or increasing the binding strength of the promoter region, thereby assisting transcription. For example, a DNA sequence for a protein of interest could be cloned or subcloned into a high copy-number plasmid containing the ''lac'' (often LacUV5) promoter, which is then transformed into ''E. coli''. Addition of IPTG (a lactose analog) activates the lac promoter and causes the bacteria to express the protein of interest. BL21 and BL21(DE3) are two ''E. coli'' strains commonly used for protein production. As members of the B lineage, they lack the ''lon'' and ''OmpT'' proteases, which protects the produced proteins from degradation. The DE3 prophage found in BL21(DE3) provides T7 RNA polymerase (driven by the LacUV5 promoter), allowing vectors with the T7 promoter to be used instead.
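
The strain and promoter pairing described above can be captured in a small lookup. The Python sketch below records only what the text states: both strains lack the lon and OmpT proteases, and only BL21(DE3) supplies T7 RNA polymerase. The `STRAINS` table and the `can_drive` function are hypothetical names for illustration, and the assumption that a lac-family promoter is transcribed by the host's own RNA polymerase in either strain is a simplification.

```python
# Properties of the two strains as described in the text.
STRAINS = {
    "BL21": {
        "missing_proteases": {"lon", "OmpT"},
        "t7_rna_polymerase": False,
    },
    "BL21(DE3)": {
        "missing_proteases": {"lon", "OmpT"},
        "t7_rna_polymerase": True,  # supplied by the DE3 prophage
    },
}


def can_drive(strain: str, promoter: str) -> bool:
    """Return True if the strain can transcribe from the given promoter."""
    info = STRAINS[strain]
    if promoter == "T7":
        # T7 promoters require T7 RNA polymerase from the host.
        return info["t7_rna_polymerase"]
    if promoter in {"lac", "lacUV5"}:
        # Assumed to be read by the host's native RNA polymerase.
        return True
    raise ValueError(f"unknown promoter: {promoter}")
```

Under these assumptions, `can_drive("BL21", "T7")` is `False` while `can_drive("BL21(DE3)", "T7")` is `True`, which mirrors why T7-promoter vectors are paired with the DE3 strain.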


''Corynebacterium''

Non-pathogenic species of the gram-positive ''Corynebacterium'' are used for the commercial production of various amino acids. The species ''C. glutamicum'' is widely used for producing glutamate and lysine, components of human food, animal feed and pharmaceutical products. Functionally active human epidermal growth factor has been expressed in ''C. glutamicum'', demonstrating a potential for industrial-scale production of human proteins. Expressed proteins can be targeted for secretion through either the general secretory pathway (Sec) or the twin-arginine translocation pathway (Tat). Unlike gram-negative bacteria, the gram-positive ''Corynebacterium'' lack lipopolysaccharides, which function as antigenic endotoxins in humans.


''Pseudomonas fluorescens''

The non-pathogenic, gram-negative bacterium ''Pseudomonas fluorescens'' is used for high-level production of recombinant proteins, commonly for the development of biotherapeutics and vaccines. ''P. fluorescens'' is a metabolically versatile organism, allowing for high-throughput screening and rapid development of complex proteins. ''P. fluorescens'' is best known for its ability to rapidly and successfully produce high titers of active, soluble protein.


Eukaryotic systems


Yeasts

Expression systems using either ''S. cerevisiae'' or ''Pichia pastoris'' allow stable and lasting production of proteins that are processed similarly to those in mammalian cells, at high yield, in chemically defined media.


Filamentous fungi

Filamentous fungi, especially ''Aspergillus'' and ''Trichoderma'', but also more recently ''Myceliophthora thermophila'' C1, have been developed into expression platforms for screening and production of diverse industrial enzymes. The C1 expression system shows a low-viscosity morphology in submerged culture, enabling the use of complex growth and production media.


''Baculovirus''-infected cells

Baculovirus-infected insect cells (Sf9, Sf21, High Five strains) or mammalian cells (HeLa, HEK 293) allow production of glycosylated or membrane proteins that cannot be produced using fungal or bacterial systems. The system is useful for producing proteins in high quantity. Genes are not expressed continuously because infected host cells eventually lyse and die during each infection cycle.


Non-lytic insect cell expression

Non-lytic insect cell expression is an alternative to the lytic baculovirus expression system. In non-lytic expression, vectors are transiently or stably transfected into the chromosomal DNA of insect cells for subsequent gene expression. This is followed by selection and screening of recombinant clones. The non-lytic system has been used to give higher protein yield and quicker expression of recombinant genes compared to baculovirus-infected cell expression. Cell lines used for this system include Sf9 and Sf21 from ''Spodoptera frugiperda'', Hi-5 from ''Trichoplusia ni'', and Schneider 2 and Schneider 3 cells from ''Drosophila melanogaster''. With this system, cells do not lyse and several cultivation modes can be used. Additionally, protein production runs are reproducible and the system gives a homogeneous product. A drawback of this system is the requirement of an additional screening step for selecting viable clones.


''Excavata''

''Leishmania tarentolae'' (which cannot infect mammals) expression systems allow stable and lasting production of proteins at high yield, in chemically defined media. Produced proteins exhibit fully eukaryotic post-translational modifications, including glycosylation and disulfide bond formation.


Mammalian systems

The most common mammalian expression systems are Chinese hamster ovary (CHO) and human embryonic kidney (HEK) cells.

* Chinese hamster ovary cell
* Mouse myeloma lymphoblastoid (e.g. NS0 cell)
* Fully human
** Human embryonic kidney cells (HEK-293)
** Human embryonic retinal cells (Crucell's Per.C6)
** Human amniocyte cells (Glycotope and CEVEC)


Cell-free systems

Cell-free production of proteins is performed ''in vitro'' using purified RNA polymerase, ribosomes, tRNA and ribonucleotides. These reagents may be produced by extraction from cells or from a cell-based expression system. Due to the low expression levels and high cost of cell-free systems, cell-based systems are more widely used.


See also

* Cellosaurus, a database of cell lines
* Gene expression
* Single-cell protein
* Protein purification
* Precision fermentation
* Host cell protein
* List of recombinant proteins

