HOME

TheInfoList



OR:

In
computer programming Computer programming or coding is the composition of sequences of instructions, called computer program, programs, that computers can follow to perform tasks. It involves designing and implementing algorithms, step-by-step specifications of proc ...
, genetic representation is a way of presenting solutions/individuals in
evolutionary computation Evolutionary computation from computer science is a family of algorithms for global optimization inspired by biological evolution, and the subfield of artificial intelligence and soft computing studying these algorithms. In technical terms ...
methods. The term encompasses both the concrete
data structure In computer science, a data structure is a data organization and storage format that is usually chosen for Efficiency, efficient Data access, access to data. More precisely, a data structure is a collection of data values, the relationships amo ...
s and
data type In computer science and computer programming, a data type (or simply type) is a collection or grouping of data values, usually specified by a set of possible values, a set of allowed operations on these values, and/or a representation of these ...
s used to realize the genetic material of the candidate solutions in the form of a genome, and the relationships between search space and problem space. In the simplest case, the search space corresponds to the problem space (direct representation). The choice of problem representation is tied to the choice of
genetic operator A genetic operator is an Operator (programming), operator used in evolutionary algorithms (EA) to guide the algorithm towards a solution to a given problem. There are three main types of operators (Mutation (evolutionary algorithm) , mutation, Cro ...
s, both of which have a decisive effect on the efficiency of the optimization. Genetic representation can encode appearance, behavior, physical qualities of individuals. Difference in genetic representations is one of the major criteria drawing a line between known classes of evolutionary computation. Terminology is often analogous with natural
genetics Genetics is the study of genes, genetic variation, and heredity in organisms.Hartl D, Jones E (2005) It is an important branch in biology because heredity is vital to organisms' evolution. Gregor Mendel, a Moravian Augustinians, Augustinian ...
. The block of computer memory that represents one candidate solution is called an individual. The data in that block is called a
chromosome A chromosome is a package of DNA containing part or all of the genetic material of an organism. In most chromosomes, the very long thin DNA fibers are coated with nucleosome-forming packaging proteins; in eukaryotic cells, the most import ...
. Each chromosome consists of genes. The possible values of a particular gene are called
allele An allele is a variant of the sequence of nucleotides at a particular location, or Locus (genetics), locus, on a DNA molecule. Alleles can differ at a single position through Single-nucleotide polymorphism, single nucleotide polymorphisms (SNP), ...
s. A programmer may represent all the individuals of a
population Population is a set of humans or other organisms in a given region or area. Governments conduct a census to quantify the resident population size within a given jurisdiction. The term is also applied to non-human animals, microorganisms, and pl ...
using ''binary encoding'', ''permutational encoding'', ''encoding by tree'', or any one of several other representations.


Representations in some popular evolutionary algorithms

Genetic algorithm In computer science and operations research, a genetic algorithm (GA) is a metaheuristic inspired by the process of natural selection that belongs to the larger class of evolutionary algorithms (EA). Genetic algorithms are commonly used to g ...
s (GAs) are typically linear representations; these are often, but not always, binary. Holland's original description of GA used arrays of bits. Arrays of other types and structures can be used in essentially the same way. The main property that makes these genetic representations convenient is that their parts are easily aligned due to their fixed size. This facilitates simple crossover operation. Depending on the application, variable-length representations have also been successfully used and tested in
evolutionary algorithm Evolutionary algorithms (EA) reproduce essential elements of the biological evolution in a computer algorithm in order to solve "difficult" problems, at least Approximation, approximately, for which no exact or satisfactory solution methods are k ...
s (EA) in general and
genetic algorithm In computer science and operations research, a genetic algorithm (GA) is a metaheuristic inspired by the process of natural selection that belongs to the larger class of evolutionary algorithms (EA). Genetic algorithms are commonly used to g ...
s in particular, although the implementation of crossover is more complex in this case.
Evolution strategy Evolution strategy (ES) from computer science is a subclass of evolutionary algorithms, which serves as an optimization (mathematics), optimization technique. It uses the major genetic operators mutation (evolutionary algorithm), mutation, recomb ...
uses linear real-valued representations, e.g., an array of real values. It uses mostly gaussian mutation and blending/averaging crossover.
Genetic programming Genetic programming (GP) is an evolutionary algorithm, an artificial intelligence technique mimicking natural evolution, which operates on a population of programs. It applies the genetic operators selection (evolutionary algorithm), selection a ...
(GP) pioneered tree-like representations and developed
genetic operator A genetic operator is an Operator (programming), operator used in evolutionary algorithms (EA) to guide the algorithm towards a solution to a given problem. There are three main types of operators (Mutation (evolutionary algorithm) , mutation, Cro ...
s suitable for such representations. Tree-like representations are used in GP to represent and evolve functional programs with desired properties.
Human-based genetic algorithm In evolutionary computation, a human-based genetic algorithm (HBGA) is a genetic algorithm that allows humans to contribute solution suggestions to the evolutionary process. For this purpose, a HBGA has human interfaces for initialization, mutation, ...
(HBGA) offers a way to avoid solving hard representation problems by outsourcing all genetic operators to outside agents, in this case, humans. The algorithm has no need for knowledge of a particular fixed genetic representation as long as there are enough external agents capable of handling those representations, allowing for free-form and evolving genetic representations.


Common genetic representations

* binary array * integer or real-valued array *
binary tree In computer science, a binary tree is a tree data structure in which each node has at most two children, referred to as the ''left child'' and the ''right child''. That is, it is a ''k''-ary tree with . A recursive definition using set theor ...
*
natural language A natural language or ordinary language is a language that occurs naturally in a human community by a process of use, repetition, and change. It can take different forms, typically either a spoken language or a sign language. Natural languages ...
*
parse tree A parse tree or parsing tree (also known as a derivation tree or concrete syntax tree) is an ordered, rooted tree that represents the syntactic structure of a string according to some context-free grammar. The term ''parse tree'' itself is use ...
* directed graph


Distinction between search space and problem space

Analogous to biology, EAs distinguish between problem space (corresponds to
phenotype In genetics, the phenotype () is the set of observable characteristics or traits of an organism. The term covers the organism's morphology (physical form and structure), its developmental processes, its biochemical and physiological propert ...
) and search space (corresponds to
genotype The genotype of an organism is its complete set of genetic material. Genotype can also be used to refer to the alleles or variants an individual carries in a particular gene or genetic location. The number of alleles an individual can have in a ...
). The problem space contains concrete solutions to the problem being addressed, while the search space contains the encoded solutions. The mapping from search space to problem space is called genotype-phenotype mapping. The
genetic operator A genetic operator is an Operator (programming), operator used in evolutionary algorithms (EA) to guide the algorithm towards a solution to a given problem. There are three main types of operators (Mutation (evolutionary algorithm) , mutation, Cro ...
s are applied to elements of the search space, and for evaluation, elements of the search space are mapped to elements of the problem space via genotype-phenotype mapping.


Relationships between search space and problem space

The importance of an appropriate choice of search space for the success of an EA application was recognized early on. The following requirements can be placed on a suitable search space and thus on a suitable genotype-phenotype mapping:


Completeness

All possible admissible solutions must be contained in the search space.


Redundancy

When more possible genotypes exist than phenotypes, the genetic representation of the EA is called ''redundant''. In nature, this is termed a degenerate
genetic code Genetic code is a set of rules used by living cell (biology), cells to Translation (biology), translate information encoded within genetic material (DNA or RNA sequences of nucleotide triplets or codons) into proteins. Translation is accomplished ...
. In the case of a redundant representation, ''neutral mutations'' are possible. These are mutations that change the genotype but do not affect the phenotype. Thus, depending on the use of the
genetic operator A genetic operator is an Operator (programming), operator used in evolutionary algorithms (EA) to guide the algorithm towards a solution to a given problem. There are three main types of operators (Mutation (evolutionary algorithm) , mutation, Cro ...
s, there may be phenotypically unchanged offspring, which can lead to unnecessary fitness determinations, among other things. Since the evaluation in real-world applications usually accounts for the lion's share of the computation time, it can slow down the
optimization Mathematical optimization (alternatively spelled ''optimisation'') or mathematical programming is the selection of a best element, with regard to some criteria, from some set of available alternatives. It is generally divided into two subfiel ...
process. In addition, this can cause the population to have higher genotypic diversity than phenotypic diversity, which can also hinder evolutionary progress. In biology, the Neutral Theory of Molecular Evolution states that this effect plays a dominant role in natural evolution. This has motivated researchers in the EA community to examine whether neutral mutations can improve EA functioning by giving populations that have converged to a local optimum a way to escape that local optimum through
genetic drift Genetic drift, also known as random genetic drift, allelic drift or the Wright effect, is the change in the Allele frequency, frequency of an existing gene variant (allele) in a population due to random chance. Genetic drift may cause gene va ...
. This is discussed controversially and there are no conclusive results on neutrality in EAs. On the other hand, there are other proven measures to handle premature convergence.


Locality

The locality of a genetic representation corresponds to the degree to which distances in the search space are preserved in the problem space after genotype-phenotype mapping. That is, a representation has a high locality exactly when neighbors in the search space are also neighbors in the problem space. In order for successful schemata not to be destroyed by genotype-phenotype mapping after a minor
mutation In biology, a mutation is an alteration in the nucleic acid sequence of the genome of an organism, virus, or extrachromosomal DNA. Viral genomes contain either DNA or RNA. Mutations result from errors during DNA or viral replication, ...
, the locality of a representation must be high.


Scaling

In genotype-phenotype mapping, the elements of the genotype can be scaled (weighted) differently. The simplest case is uniform scaling: all elements of the genotype are equally weighted in the phenotype. A common scaling is exponential. If
integer An integer is the number zero (0), a positive natural number (1, 2, 3, ...), or the negation of a positive natural number (−1, −2, −3, ...). The negations or additive inverses of the positive natural numbers are referred to as negative in ...
s are binary coded, the individual digits of the resulting binary number have exponentially different weights in representing the phenotype. :Example: The number 90 is written in binary (i.e., in base two) as 1011010. If now one of the front digits is changed in the binary notation, this has a significantly greater effect on the coded number than any changes at the rear digits (the selection pressure has an exponentially greater effect on the front digits). For this reason, exponential scaling has the effect of randomly fixing the "posterior" locations in the genotype before the population gets close enough to the optimum to adjust for these subtleties.


Hybridization and repair in genotype-phenotype mapping

When mapping the genotype to the phenotype being evaluated, domain-specific knowledge can be used to improve the phenotype and/or ensure that constraints are met. This is a commonly used method to improve EA performance in terms of runtime and solution quality. It is illustrated below by two of the three examples.


Examples


Example of a direct representation

An obvious and commonly used encoding for the
traveling salesman problem In the theory of computational complexity, the travelling salesman problem (TSP) asks the following question: "Given a list of cities and the distances between each pair of cities, what is the shortest possible route that visits each city exac ...
and related tasks is to number the cities to be visited consecutively and store them as integers in the
chromosome A chromosome is a package of DNA containing part or all of the genetic material of an organism. In most chromosomes, the very long thin DNA fibers are coated with nucleosome-forming packaging proteins; in eukaryotic cells, the most import ...
. The
genetic operator A genetic operator is an Operator (programming), operator used in evolutionary algorithms (EA) to guide the algorithm towards a solution to a given problem. There are three main types of operators (Mutation (evolutionary algorithm) , mutation, Cro ...
s must be suitably adapted so that they only change the order of the cities (genes) and do not cause deletions or duplications. Thus, the gene order corresponds to the city order and there is a simple one-to-one mapping.


Example of a complex genotype-phenotype mapping

In a scheduling task with heterogeneous and partially alternative resources to be assigned to a set of subtasks, the genome must contain all necessary information for the individual scheduling operations or it must be possible to derive them from it. In addition to the order of the subtasks to be executed, this includes information about the resource selection. A phenotype then consists of a list of subtasks with their start times and assigned resources. In order to be able to create this, as many allocation matrices must be created as resources can be allocated to one subtask at most. In the simplest case this is one resource, e.g., one machine, which can perform the subtask. An allocation matrix is a two-dimensional matrix, with one dimension being the available time units and the other being the resources to be allocated. Empty matrix cells indicate availability, while an entry indicates the number of the assigned subtask. The creation of allocation matrices ensures firstly that there are no inadmissible multiple allocations. Secondly, the start times of the subtasks can be read from it as well as the assigned resources. A common constraint when scheduling resources to subtasks is that a resource can only be allocated once per time unit and that the reservation must be for a contiguous period of time. To achieve this in a timely manner, which is a common optimization goal and not a constraint, a simple
heuristic A heuristic or heuristic technique (''problem solving'', '' mental shortcut'', ''rule of thumb'') is any approach to problem solving that employs a pragmatic method that is not fully optimized, perfected, or rationalized, but is nevertheless ...
can be used: Allocate the required resource for the desired time period as early as possible, avoiding duplicate reservations. The advantage of this simple procedure is twofold: it avoids the constraint and helps the optimization. If the scheduling problem is modified to the scheduling of workflows instead of independent subtasks, at least some of the work steps of a workflow have to be executed in a given order. If the previously described scheduling heuristic now determines that the predecessor of a work step is not completed when it should be started itself, the following repair mechanism can help: Postpone the scheduling of this work step until all its predecessors are finished. Since the genotype remains unchanged and repair is performed only at the phenotype level, it is also called '' phenotypic repair''.


Example of a heuristic-based genotype-phenotype mapping

The following layout planning task is intended to illustrate a different use of a
heuristic A heuristic or heuristic technique (''problem solving'', '' mental shortcut'', ''rule of thumb'') is any approach to problem solving that employs a pragmatic method that is not fully optimized, perfected, or rationalized, but is nevertheless ...
in genotype-phenotype mapping: On a rectangular surface different geometric types of objects are to be arranged in such a way that as little area as possible remains unused. The objects can be rotated, must not overlap after placement, and must be positioned completely on the surface. A related application would be scrap minimization when cutting parts from a steel plate or fabric sheet. The coordinates of the centers of the objects and a rotation angle reduced to possible isomorphisms of the geometry of the objects can be considered as variables to be determined. If this is done directly by an EA, there will probably be a lot of overlaps. To avoid this, only the angle and the coordinate of one side of the rectangle are determined by the EA. Each object is now rotated and positioned on the edge of that side, shifting it if necessary so that it is inside the rectangle when it is subsequently moved. Then it is moved parallel to the other side until it touches another object or reaches the opposite end of the rectangle. In this way, overlaps are avoided and the unused area is reduced per placement, but not in general, which is left to optimization.


References

{{DEFAULTSORT:Genetic Representation Evolutionary algorithms