Abstract semantic graph
   HOME

TheInfoList



OR:

In
computer science Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to Applied science, practical discipli ...
, an abstract semantic
graph Graph may refer to: Mathematics *Graph (discrete mathematics), a structure made of vertices and edges **Graph theory, the study of such graphs and their properties *Graph (topology), a topological space resembling a graph in the sense of discre ...
(ASG) or term graph is a form of
abstract syntax In computer science, the abstract syntax of data is its structure described as a data type (possibly, but not necessarily, an abstract data type), independent of any particular representation or encoding. This is particularly used in the representa ...
in which an
expression Expression may refer to: Linguistics * Expression (linguistics), a word, phrase, or sentence * Fixed expression, a form of words with a specific meaning * Idiom, a type of fixed expression * Metaphorical expression, a particular word, phrase, o ...
of a formal or
programming language A programming language is a system of notation for writing computer programs. Most programming languages are text-based formal languages, but they may also be graphical. They are a kind of computer language. The description of a programming ...
is represented by a
graph Graph may refer to: Mathematics *Graph (discrete mathematics), a structure made of vertices and edges **Graph theory, the study of such graphs and their properties *Graph (topology), a topological space resembling a graph in the sense of discre ...
whose vertices are the expression's subterms. An ASG is at a higher level of abstraction than an
abstract syntax tree In computer science, an abstract syntax tree (AST), or just syntax tree, is a tree representation of the abstract syntactic structure of text (often source code) written in a formal language. Each node of the tree denotes a construct occurring ...
(or AST), which is used to express the
syntactic structure In linguistics, syntax () is the study of how words and morphemes combine to form larger units such as phrases and sentences. Central concerns of syntax include word order, grammatical relations, hierarchical sentence structure ( constituency) ...
of an expression or
program Program, programme, programmer, or programming may refer to: Business and management * Program management, the process of managing several related projects * Time management * Program, a part of planning Arts and entertainment Audio * Progra ...
. ASGs are more complex and concise than ASTs because they may contain shared subterms (also known as "common subexpressions"). Abstract semantic graphs are often used as an
intermediate representation An intermediate representation (IR) is the data structure or code used internally by a compiler or virtual machine to represent source code. An IR is designed to be conducive to further processing, such as optimization and translation. A "good" ...
by
compilers In computing, a compiler is a computer program that translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primarily used for programs that ...
to store the results of performing
common subexpression elimination In compiler theory, common subexpression elimination (CSE) is a compiler optimization that searches for instances of identical expressions (i.e., they all evaluate to the same value), and analyzes whether it is worthwhile replacing them with a sing ...
upon abstract syntax trees. ASTs are
trees In botany, a tree is a perennial plant with an elongated stem, or trunk, usually supporting branches and leaves. In some usages, the definition of a tree may be narrower, including only woody plants with secondary growth, plants that are u ...
and are thus incapable of representing shared terms. ASGs are usually directed acyclic graphs (DAG), although in some applications graphs containing cycles may be permitted. For example, a graph containing a cycle might be used to represent the
recursive Recursion (adjective: ''recursive'') occurs when a thing is defined in terms of itself or of its type. Recursion is used in a variety of disciplines ranging from linguistics to logic. The most common application of recursion is in mathematics ...
expressions that are commonly used in
functional programming language In computer science, functional programming is a programming paradigm where programs are constructed by applying and composing functions. It is a declarative programming paradigm in which function definitions are trees of expressions that m ...
s as non-
loop Loop or LOOP may refer to: Brands and enterprises * Loop (mobile), a Bulgarian virtual network operator and co-founder of Loop Live * Loop, clothing, a company founded by Carlos Vasquez in the 1990s and worn by Digable Planets * Loop Mobile, an ...
ing
iteration Iteration is the repetition of a process in order to generate a (possibly unbounded) sequence of outcomes. Each repetition of the process is a single iteration, and the outcome of each iteration is then the starting point of the next iteration. ...
constructs. The mutability of these types of graphs, is studied in the field of
graph rewriting In computer science, graph transformation, or graph rewriting, concerns the technique of creating a new graph out of an original graph algorithmically. It has numerous applications, ranging from software engineering (software construction and also ...
. The nomenclature ''term graph'' is associated with the field of term graph rewriting, which involves the transformation and processing of expressions by the specification of rewriting rules, whereas ''abstract semantic graph'' is used when discussing
linguistics Linguistics is the scientific study of human language. It is called a scientific study because it entails a comprehensive, systematic, objective, and precise analysis of all aspects of language, particularly its nature and structure. Linguis ...
,
programming languages A programming language is a system of notation for writing computer programs. Most programming languages are text-based formal languages, but they may also be graphical. They are a kind of computer language. The description of a programming ...
,
type systems In computer programming, a type system is a logical system comprising a set of rules that assigns a property called a type (computer science), type to every "term" (a word, phrase, or other set of symbols). Usually the terms are various constru ...
and
compilation Compilation may refer to: *In computer programming, the translation of source code into object code by a compiler **Compilation error **Compilation unit *Product bundling, a marketing strategy used to sell multiple products *Compilation thesis M ...
. Abstract syntax trees are not capable of sharing subexpression nodes because it is not possible for a node in a proper tree to have more than one parent. Although this conceptual simplicity is appealing, it may come at the cost of redundant representation and, in turn, possibly inefficiently duplicating the computation of identical terms. For this reason ASGs are often used as an
intermediate language An intermediate representation (IR) is the data structure or code used internally by a compiler or virtual machine to represent source code. An IR is designed to be conducive to further processing, such as optimization and translation. A " ...
at a subsequent compilation stage to abstract syntax tree construction via parsing. An abstract semantic graph is typically constructed from an abstract syntax tree by a process of enrichment and abstraction. The enrichment can for example be the addition of
back-pointer In computer science, a pointer is an object in many programming languages that stores a memory address. This can be that of another value located in computer memory, or in some cases, that of memory-mapped computer hardware. A pointer ''ref ...
s, edges from an
identifier An identifier is a name that identifies (that is, labels the identity of) either a unique object or a unique ''class'' of objects, where the "object" or class may be an idea, physical countable object (or class thereof), or physical noncountable ...
node (where a
variable Variable may refer to: * Variable (computer science), a symbolic name associated with a value and whose associated value may be changed * Variable (mathematics), a symbol that represents a quantity in a mathematical expression, as used in many ...
is being used) to a node representing the
declaration Declaration may refer to: Arts, entertainment, and media Literature * ''Declaration'' (book), a self-published electronic pamphlet by Michael Hardt and Antonio Negri * ''The Declaration'' (novel), a 2008 children's novel by Gemma Malley Music ...
of that variable. The abstraction can
entail In English common law, fee tail or entail is a form of trust established by deed or settlement which restricts the sale or inheritance of an estate in real property and prevents the property from being sold, devised by will, or otherwise alien ...
the removal of details which are relevant only in
parsing Parsing, syntax analysis, or syntactic analysis is the process of analyzing a string of symbols, either in natural language, computer languages or data structures, conforming to the rules of a formal grammar. The term ''parsing'' comes from Lati ...
, not for semantics.


Example: Code Refactoring

For example, consider the case of
code refactoring In computer programming and software design, code refactoring is the process of restructuring existing computer code—changing the '' factoring''—without changing its external behavior. Refactoring is intended to improve the design, structur ...
. To represent the implementation of a function that takes an input argument, the received parameter is conventionally given an arbitrary, distinct
name A name is a term used for identification by an external observer. They can identify a class or category of things, or a single thing, either uniquely, or within a given context. The entity identified by a name is called its referent. A personal ...
in the source code so that it can be referenced. The abstract representation of this conceptual entity, a "function argument" instance, will likely be mentioned in the function signature, and also one or more times within the implementation code body. Since the function as a whole is the parent of both its header or "signature" information as well as its implementation body, an AST would not be able to use the same node to co-identify the multiple uses or appearances of the argument entity. This is solved by the DAG nature of an ASG. A key advantage of having a single, distinct node identity for any given code element is that each element's properties are, by definition, uniquely stored. This simplifies refactoring operations, because there is exactly one existential nexus for any given property instantiation. If the developer decides to change a property value such as the "name" of any code element (the "function argument" in this example), the ASG inherently exposes that value in exactly one place, and it follows that any such property changes are implicitly, trivially, and immediately propagated globally.


See also

*
Abstract syntax tree In computer science, an abstract syntax tree (AST), or just syntax tree, is a tree representation of the abstract syntactic structure of text (often source code) written in a formal language. Each node of the tree denotes a construct occurring ...
*
Ontology (computer science) In computer science and information science, an ontology encompasses a representation, formal naming, and definition of the categories, properties, and relations between the concepts, data, and entities that substantiate one, many, or all domains ...
* Semantic Web * Semantic Grid


References


External links

* * * * Graph data structures Formal languages {{formalmethods-stub