Infologs

Infologs are independently designed synthetic genes derived from one or a few genes, in which substitutions are systematically incorporated to maximize information. They are designed to approach a perfect diversity distribution, which maximizes search efficiency. Typical protein engineering methods rely on screening a large number (10^6 to 10^12 or more) of gene variants, using a surrogate high-throughput (HTP) screen to identify initial hits with improved activity. Unfortunately, results are defined by what is screened for, so the “hit” from the HTP screen often has very little real activity in a lower-throughput assay that is more indicative of the improved functionality for which the protein is being developed. Adapting the standard algorithms for engineering complex systems to biological systems yields a process that lets researchers deconvolute how substitutions within a protein sequence modify its function. Combining these algorithms with an integrated query and ranking mechanism allows appropriate sequence substitutions to be identified. "Infologs" refers to the set of designed genes; the singular "Infolog" describes an individual variant.
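One way to picture "systematically incorporated" substitutions is a design in which every candidate substitution appears in the same number of variants. The Python sketch below is a minimal illustration under that assumption, not the patented design method: the function names, the greedy least-used-first rule, the toy parent sequence and the candidate substitutions are all hypothetical.

```python
import random
from collections import Counter

def design_variants(substitutions, n_variants, subs_per_variant, seed=0):
    """Greedy balanced design (illustrative assumption, not the real algorithm).

    substitutions: list of (position, residue) pairs, positions 0-based.
    Each variant receives the least-used substitutions so far, at most one
    per position, so single-substitution usage stays as even as possible.
    """
    rng = random.Random(seed)
    usage = Counter({s: 0 for s in substitutions})
    designs = []
    for _ in range(n_variants):
        # Least-used substitutions first; ties are broken randomly.
        order = sorted(substitutions, key=lambda s: (usage[s], rng.random()))
        chosen, used_positions = [], set()
        for pos, res in order:
            if pos in used_positions:
                continue
            chosen.append((pos, res))
            used_positions.add(pos)
            if len(chosen) == subs_per_variant:
                break
        for s in chosen:
            usage[s] += 1
        designs.append(sorted(chosen))
    return designs, usage

def apply_substitutions(parent, chosen):
    """Return the variant sequence carrying the chosen substitutions."""
    seq = list(parent)
    for pos, res in chosen:
        seq[pos] = res
    return "".join(seq)

# Toy example: 6 candidate substitutions, 8 variants, 3 substitutions each.
parent = "MKTAYIAK"
candidates = [(1, "R"), (2, "S"), (3, "G"), (4, "F"), (5, "L"), (6, "V")]
designs, usage = design_variants(candidates, n_variants=8, subs_per_variant=3)
for chosen in designs:
    print(apply_substitutions(parent, chosen), chosen)
```

This sketch balances only how often each single substitution occurs. It makes no attempt to balance pairs of substitutions, which a real design would likely also need to consider.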


Ancestry

Homology between protein or DNA sequences is defined in terms of shared ancestry. Two segments of DNA can have shared ancestry because of either a speciation event (orthologs) or a duplication event (paralogs). Homologs are similar genes and/or proteins that are related by ancestry.

Orthologs are the 'same' gene in different organisms. Homologous sequences are orthologous if they were separated by a speciation event: when a species diverges into two separate species, the copies of a single gene in the two resulting species are said to be orthologous. In other words, orthologs are genes in different species that originated by vertical descent from a single gene of the last common ancestor. The term "ortholog" was coined in 1970 by Walter Fitch.

Paralogs are related genes that originated from a single gene which, through duplication, became two genes that over time have evolved separate functions (or, according to a recent Science paper, a promiscuous starting gene that duplicated, with each copy evolving toward a different function). Paralogs typically have the same or similar function, but sometimes do not: because one copy of the duplicated gene is no longer under the original selective pressure, it is free to mutate and acquire new functions. Paralogs usually occur within the same species.

Xenologs are homologs resulting from horizontal gene transfer between two organisms. Xenologs can have different functions if the new environment is vastly different for the horizontally transferred gene. In general, though, xenologs have similar functions in both organisms.

Infologs are similar genes and/or proteins that are related by synthetic ancestry, designed to approach perfect diversity distribution.


Features

* Optimize directly for function in the final application
* Does not require high-throughput (HTP) screens
* Screen small numbers of variants (50-200) directly for the desired function
* Decreased false positives: variants identified by HTP screens that do not retain activity in the 'real' assay
* Decreased loss of potential positive hits due to screening error or poor correlation between the HTP screen and the 'real' assay
* No biodiversity collections required; everything is synthesized as needed
* Sequence-function relationships provide the basis for strong composition-of-matter patent claims


Case study

Transforming protein engineering with Infologs: using independently designed synthetic genes in which substitutions are systematically incorporated (Infologs) leads to uniform sampling, systematic variance and unrestricted, information-rich results. Wheat glutathione S-transferases (GSTs) with the ability to detoxify a panel of common herbicides were designed using this patented bioengineering method. The relative functional contribution of 60 amino acid substitutions against 14 herbicides was quantified using only 96 Infologs and dramatically improved by a small set (16) of second-generation Infologs. In addition, highly predictive GST sequence-function models against two commercially relevant herbicides were created, with quantification of the relative functional contribution of 60 amino acid substitutions in two dimensions. (Source: Enzyme Engineering Conference presentation, "Using Infologs to Engineer Biological Systems".)
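To make "quantifying the relative functional contribution" of substitutions concrete, the Python sketch below fits a simple additive model (activity ≈ intercept + sum of per-substitution weights) by least squares. The additive form, the small ridge term, the function name and the synthetic noise-free data are all assumptions made for illustration. The source does not state which model was used for the GST study, and no GST data appear here.

```python
from itertools import product

def fit_additive_model(designs, activities, ridge=1e-6):
    """Least-squares fit of activity ~ intercept + sum_j weight_j * x_j.

    designs: list of 0/1 rows, one row per variant, one column per substitution.
    activities: measured activity per variant.
    Returns (intercept, weights). Illustrative sketch only.
    """
    rows = [[1.0] + [float(x) for x in row] for row in designs]
    n = len(rows[0])
    # Normal equations (X^T X + ridge * I) w = X^T y as an augmented matrix.
    a = [
        [sum(r[i] * r[j] for r in rows) + (ridge if i == j else 0.0)
         for j in range(n)]
        + [sum(r[i] * y for r, y in zip(rows, activities))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * p for x, p in zip(a[r], a[col])]
    solution = [a[i][n] / a[i][i] for i in range(n)]
    return solution[0], solution[1:]

# Synthetic example: 3 substitutions with known (made-up) contributions.
true_intercept, true_weights = 1.0, [2.0, -1.0, 0.5]
designs = [list(bits) for bits in product([0, 1], repeat=3)]
activities = [
    true_intercept + sum(w * x for w, x in zip(true_weights, row))
    for row in designs
]
intercept, weights = fit_additive_model(designs, activities)

# Rank substitutions by the size of their estimated contribution.
ranking = sorted(range(len(weights)), key=lambda j: abs(weights[j]), reverse=True)
print(intercept, weights, ranking)
```

An additive model ignores interactions between substitutions. It is used here only because it is the simplest model in which each substitution receives its own estimated contribution.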


Rational design of proteins

In rational protein design, the scientist uses detailed knowledge of the structure and function of the protein to make desired changes. This generally has the advantage of being technically easy and inexpensive, since site-directed mutagenesis techniques are well developed. Its major drawback is that detailed structural knowledge of a protein is often unavailable, and even when it is available, it can be extremely difficult to predict the effects of various mutations. Computational protein design algorithms seek to identify novel amino acid sequences that are low in energy when folded to the pre-specified target structure. While the sequence-conformation space that needs to be searched is large, the most challenging requirement for computational protein design is a fast, yet accurate, energy function that can distinguish optimal sequences from similar suboptimal ones.
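The idea of searching sequence space for a low-energy sequence can be sketched with a deliberately tiny toy problem. In the Python example below, the two-letter alphabet (H for hydrophobic, P for polar), the `toy_energy` scoring rule and the function names are all hypothetical. Real design software uses physics-based or statistical energy functions and far more sophisticated search.

```python
from itertools import product

def toy_energy(seq):
    """Hypothetical energy (lower is better), for illustration only.

    Penalizes identical neighboring residues and a polar first residue.
    """
    clash = sum(1.0 for a, b in zip(seq, seq[1:]) if a == b)
    start_penalty = 0.0 if seq[0] == "H" else 0.5
    return clash + start_penalty

def lowest_energy_sequence(length, alphabet, energy):
    """Score every possible sequence and return the (sequence, energy) minimum."""
    candidates = ("".join(p) for p in product(alphabet, repeat=length))
    best = min(candidates, key=energy)
    return best, energy(best)

# 2**4 = 16 candidates here. With 20 amino acids and n positions there are
# 20**n sequences, so exhaustive enumeration does not scale to real proteins.
best_seq, best_energy = lowest_energy_sequence(4, "HP", toy_energy)
print(best_seq, best_energy)
```

Even in this toy, the result depends entirely on the energy function, which mirrors the point made above: the search can only be as good as the function used to rank sequences.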


See also

* OrthoDB
* Protein folding
* Synthetic biology


References


Further reading


* GRC Biocatalysis, 2014: "Systematic Exploration of Sequence Space for Protein Engineering" (poster)
* Gustafsson, Claes; Govindarajan, Sridhar; Minshull, Jeremy (2003). "Putting engineering back into protein engineering: Bioinformatic approaches to catalyst design". Current Opinion in Biotechnology. 14 (4): 366–70. doi:10.1016/S0958-1669(03)00101-0. PMID 12943844.


External links


Infologs information page at DNA2.0

Enzyme Engineering Conference Presentation: "Using Infologs to Engineer Biological Systems"