Referring expression generation (REG) is the subtask of

natural language generation Natural language generation (NLG) is a software process that produces natural language output. In one of the most widely-cited survey of NLG methods, NLG is characterized as "the subfield of artificial intelligence and computational linguistics tha ...

(NLG) that received most scholarly attention. While NLG is concerned with the conversion of non-linguistic information into natural language, REG focuses only on the creation of

referring expression In linguistics, a referring expression (RE) is any noun phrase, or surrogate for a noun phrase, whose function in discourse is to identify some individual object. The technical terminology for ''identify'' differs a great deal from one school of ...

s (noun phrases) that identify specific entities called ''targets''. This task can be split into two sections. The ''content selection'' part determines which set of properties distinguish the intended target and the ''linguistic realization'' part defines how these properties are translated into natural language. A variety of algorithms have been developed in the NLG community to generate different types of referring expressions.

Types of referring expressions

(RE), in linguistics, is any noun phrase, or surrogate for a noun phrase, whose function in discourse is to ''identify'' some individual object (thing, being, event...) The

technical terminology Jargon is the specialized terminology associated with a particular field or area of activity. Jargon is normally employed in a particular Context (language use), communicative context and may not be well understood outside that context. The conte ...

for ''identify'' differs a great deal from one school of linguistics to another. The most widespread term is probably ''refer'', and a thing identified is a ''referent'', as for example in the work of John Lyons. In linguistics, the study of reference relations belongs to

pragmatics In linguistics and related fields, pragmatics is the study of how context contributes to meaning. The field of study evaluates how human language is utilized in social interactions, as well as the relationship between the interpreter and the int ...

, the study of language use, though it is also a matter of great interest to philosophers, especially those wishing to understand the nature of

knowledge Knowledge can be defined as awareness of facts or as practical skills, and may also refer to familiarity with objects or situations. Knowledge of facts, also called propositional knowledge, is often defined as true belief that is distinc ...

perception Perception () is the organization, identification, and interpretation of sensory information in order to represent and understand the presented information or environment. All perception involves signals that go through the nervous system ...

and

cognition Cognition refers to "the mental action or process of acquiring knowledge and understanding through thought, experience, and the senses". It encompasses all aspects of intellectual functions and processes such as: perception, attention, thought, ...

more generally. Various devices can be used for reference:

determiner A determiner, also called determinative (abbreviated ), is a word, phrase, or affix that occurs together with a noun or noun phrase and generally serves to express the reference of that noun or noun phrase in the context. That is, a determiner m ...

pronoun In linguistics and grammar, a pronoun (abbreviated ) is a word or a group of words that one may substitute for a noun or noun phrase. Pronouns have traditionally been regarded as one of the parts of speech, but some modern theorists would not co ...

s, proper names... Reference relations can be of different kinds; referents can be in a "real" or imaginary world, in discourse itself, and they may be singular, plural, or collective.

Pronouns

The simplest type of referring expressions are

such as ''he'' and ''it''. The linguistics and natural language processing communities have developed various models for predicting anaphor referents, such as centering theory,M Poesio, R Stevenson, B di Eugenio, J Hitzeman (2004). Centering: A Parametric Theory and Its Instantiations. ''Computational Linguistics'' 30:309-36

/ref> and ideally referring-expression generation would be based on such models. However most NLG systems use much simpler algorithms, for example using a pronoun if the referent was mentioned in the previous sentence (or sentential clause), and no other entity of the same gender was mentioned in this sentence.

Definite noun phrases

There has been a considerable amount of research on generating definite noun phrases, such as ''the big red book''. Much of this builds on the model proposed by Dale and Reiter. This has been extended in various ways, for example Krahmer ''et al.''E Krahmer, S van Erk, A Verleg (2003). Graph-Based Generation of Referring Expressions. Computational Linguistics 23:53-7

/ref> present a graph-theoretic model of definite NP generation with many nice properties. In recent years a shared-task event has compared different algorithms for definite NP generation, using the TUNA corpus.

Spatial and temporal reference

Recently there has been more research on generating referring expressions for time and space. Such references tend to be imprecise (what is the exact meaning of ''tonight''?), and also to be interpreted in different ways by different people. Hence it may be necessary to explicitly reason about false positive vs false negative tradeoffs, and even calculate the utility of different possible referring expressions in a particular task context.R Turner, Y Sripada, E Reiter (2009) Generating Approximate Geographic Descriptions. ''Proceedings of the 12th European Workshop on Natural Language Generation (ENLG)'', pages 42–49, Athens

/ref>

Criteria for good expressions

Ideally, a good referring expression should satisfy a number of criteria: * ''Referential success'': It should unambiguously identify the referent to the reader. * ''Ease of comprehension'': The reader should be able to quickly read and understand it. * ''Computational complexity'': The generation algorithm should be fast * ''No false inferences'': The expression should not confuse or mislead the reader by suggesting false implicatures or other pragmatic inferences. For example, a reader may be confused if he is told ''Sit by the brown wooden table'' in a context where there is only one table.R Dale, E Reiter (1995). Computational interpretations of the Gricean maxims in the generation of referring expressions. ''Cognitive Science'', 18:233–263.

History

Pre-2000 era

REG goes back to the early days of NLG. One of the first approaches was done by Winograd in 1972 who developed an " incremental" REG algorithm for his

SHRDLU SHRDLU was an early natural-language understanding computer program, developed by Terry Winograd at MIT in 1968–1970. In the program, the user carries on a conversation with the computer, moving objects, naming collections and querying the st ...

program. Afterwards researchers started to model the human abilities to create referring expressions in the 1980s. This new approach to the topic was influenced by the researchers Appelt and Kronfeld who created the programs KAMP and BERTRANDD Appelt (1985). Planning English referring expressions. ''Artificial Intelligence'', 26:1–33.D Appelt, A Kronfeld (1987). A computational model of referring. ''In Proceedings of the 10th International Joint Conference on Artificial Intelligence (IJCAI)'', pages 640–647, Milan. and considered referring expressions as parts of bigger speech acts. Some of their most interesting findings were the fact that referring expressions can be used to add information beyond the identification of the referent as well as the influence of communicative context and the

Gricean maxim In social science generally and linguistics specifically, the cooperative principle describes how people achieve effective conversational communication in common social situations—that is, how listeners and speakers act cooperatively and mutua ...

s on referring expressions. Furthermore, its skepticism concerning the naturalness of minimal descriptions made Appelt and Kronfeld's research a foundation of later work on REG. The search for simple, well-defined problems changed the direction of research in the early 1990s. This new approach was led by Dale and Reiter who stressed the identification of the referent as the central goal.R Dale (1989). Cooking up referring expressions. ''In Proceedings of the 27th Annual Meeting of the Association for Computational Linguistics (ACL)'', pages 68–75.R Dale (1992). Generating Referring Expressions: Constructing Descriptions in a Domain of Objects and Processes. ''TheMIT Press'', Cambridge, MA. Like Appelt they discuss the connection between the

s and referring expressions in their culminant paper in which they also propose a formal problem definition. Furthermore, Reiter and Dale discuss the Full Brevity and Greedy Heuristics algorithms as well as their

Incremental Algorithm Increment or incremental may refer to: * Incrementalism, a theory (also used in politics as a synonym for gradualism) * Increment and decrement operators, the operators ++ and -- in computer programming * Incremental computing * Incremental backu ...

(IA) which became one of the most important algorithms in REG. This section is an excerpt of the following paper. For more details see: E Krahmer, K van Deemter (2012). Computational Generation of Referring Expressions: A Survey. ''Computational Linguistics'' 38:173-21

/ref>

Later developments

After 2000 the research began to lift some of the simplifying assumptions, that had been made in early REG research in order to create more simple algorithms. Different research groups concentrated on different limitations creating several expanded algorithms. Often these extend the IA in a single perspective for example in relation to: * ''Reference to Sets'' like "the t-shirt wearers" or "the green apples and the banana on the left" * ''Relational Descriptions'' like "the cup on the table" or "the woman who has three children"R Dale, N Haddock (1991). Generating referring expressions involving relations. ''Proceedings of the 5th Conference of the European Chapter of the Association of Computational Linguists (EACL)'', pages 161–166, Berlin.J Viethen, R Dale (2008). The use of spatial relations in referring expressions. ''Proceedings of the 5th International Natural Language Generation Conference (INLG)'', pages 59–67, Salt Fork, OH.E Krahmer, M Goudbeek, M Theune (2014). Referring Expression Generation in Interaction: A Graph-based perspective. ''A Stent, S Bangalore (eds.), Natural Language Generation in Interactive Systems''. Cambridge University Press. * ''Context Dependency'', ''Vagueness'' and ''Gradeability'' include statements like "the older man" or "the car on the left" which are often unclear without a context * ''Salience'' and ''Generation of Pronouns'' are highly discourse dependent making for example "she" a reference to "the (most salient) female person" Many simplifying assumptions are still in place or have just begun to be worked on. Also a combination of the different extensions has yet to be done and is called a "non-trivial enterprise" by Krahmer and van Deemter.E Krahmer, K van Deemter (2012). Computational Generation of Referring Expressions: A Survey. ''Computational Linguistics'' 38:173-21

/ref> Another important change after 2000 was the increasing use of #Evaluation of REG systems, empirical studies in order to evaluate algorithms. This development took place due to the emergence of transparent corpora. Although there are still discussions about what the best evaluation metrics are, the use of experimental evaluation has already led to a better comparability of algorithms, a discussion about the goals of REG and more task-oriented research. Furthermore, research has extended its range to related topics such as the choice of ''

Knowledge Representation Knowledge representation and reasoning (KRR, KR&R, KR²) is the field of artificial intelligence (AI) dedicated to representing information about the world in a form that a computer system can use to solve complex tasks such as diagnosing a medic ...

(KR) Frameworks''. In this area the main question, which KR framework is most suitable for the use in REG remains open. The answer to this question depends on how well descriptions can be expressed or found. A lot of the potential of KR frameworks has been left unused so far. Some of the different approaches are the usage of: * ''

Graph search In computer science, graph traversal (also known as graph search) refers to the process of visiting (checking and/or updating) each vertex in a graph. Such traversals are classified by the order in which the vertices are visited. Tree traversal ...

'' which treats relations between targets in the same way as properties. * '' Constraint Satisfaction'' which allows for a separation between problem specification and the implementation. * ''Modern Knowledge Representation'' which offers

logical inference Inferences are steps in reasoning, moving from premises to logical consequences; etymologically, the word ''wikt:infer, infer'' means to "carry forward". Inference is theoretically traditionally divided into deductive reasoning, deduction and in ...

in for example Description Logic or

Conceptual Graph A conceptual graph (CG) is a formalism for knowledge representation. In the first published paper on CGs, John F. Sowa used them to represent the conceptual schemas used in database systems. The first book on CGs applied them to a wide range of ...

Problem definition

Dale and Reiter (1995) think about referring expressions as distinguishing descriptions. They define: * The ''referent'' as the entity that should be described * The ''context set'' as set of salient entities * The ''contrast set'' or ''potential distractors'' as all elements of the context set except the referent * A ''property'' as a reference to a single attribute–value pair Each entity in the domain can be characterised as a set of attribute–value pairs for example

\langle

type, dog

\rangle

\langle

gender, female

\rangle

\langle

age, 10 years

\rangle

. The problem then is defined as follows: Let

r

be the intended referent, and

C

be the contrast set. Then, a set

L

of attribute–value pairs will represent a distinguishing description if the following two conditions hold: # Every attribute–value pair in

L

applies to

r

: that is, every element of

L

specifies an attribute–value that

r

possesses. # For every member

c

C

, there is at least one element

l

L

that does not apply to

c

: that is, there is an

l

L

that specifies an attribute–value that

c

does not possess.

l

is said to rule out

c

. In other words, to generate a referring expression one is looking for a set of properties that apply to the referent but not to the distractors. The problem could be easily solved by conjoining all the properties of the referent which often leads to long descriptions violating the second Gricean Maxim of Quantity. Another approach would be to find the shortest distinguishing description like the Full Brevity algorithm does. Yet in practice it is most common to instead include the condition that referring expressions produced by an algorithm should be as similar to human-produced ones as possible although this is often not explicitly mentioned.

Basic algorithms

Full Brevity

The Full Brevity algorithm always finds a minimal distinguishing description meaning there is no shorter distinguishing description in regard to properties used. Therefore, it iterates over

n=1,2,3,4,...

and checks every description of a length of

n

properties until a distinguishing description is found. Two problems arise from this way of creating referring expressions. Firstly the algorithm has a high complexity meaning it is

NP-hard In computational complexity theory, NP-hardness ( non-deterministic polynomial-time hardness) is the defining property of a class of problems that are informally "at least as hard as the hardest problems in NP". A simple example of an NP-hard pr ...

which makes it impractical to use. Secondly human speakers produce descriptions that are not minimal in many situations.

Greedy Heuristics

The Greedy Heuristics algorithm approximates the Full Brevity algorithm by iteratively adding the most distinguishing property to the description. The most distinguishing property means the property that rules out most of the remaining distractors. The Greedy Heuristics algorithm is more efficient than the Full Brevity algorithm. Dale and Reiter(1995) present the following algorithm for the Greedy Heuristic: Let

L

be the set of properties to be realised in our description; let

P

be the set of properties known to be true of our intended referent

r

(we assume that

P

is non-empty); and let

C

be the set of distractors (the contrast set). The initial conditions are thus as follows:

C = \

;

P = \

;

L = \

In order to describe the intended referent

r

with respect to the contrast set

C

, we do the following: 1. Check Success: if

, C,  = 0

then return

L

as a distinguishing description elseif

P = \empty

then fail else goto Step 2. 2. Choose Property: for each

p_i \in P

do:

C_i \leftarrow C \cap \

Chosen property is

p_j

, where

C_j

is the smallest set. goto Step 3. 3. Extend Description (wrt the chosen

p_j

L \leftarrow L \cup \

C \leftarrow C_j

P \leftarrow P - \

goto Step 1.

Incremental Algorithm

The Incremental Algorithm (IA) by Dale and Reiter was the most influential algorithm before 2000. It is based on the idea of a ''preferential'' order of attributes or properties that speakers go by. So in order to run the Incremental Algorithm, first a preference order of attributes has to be given. Now the algorithm follows that order and adds those properties to the description which rule out any remaining distractors. Furthermore, Dale and Reiter stress the attribute type which is always included in their descriptions even if it does not rule out any distractors. Also the type values are part of a ''subsumption hierarchy'' including some ''basic level values''. For example, in the ''pet'' domain ''chihuahua'' is subsumed by ''dog'' and ''dog'' by ''animal''. Because ''dog'' is defined as a basic level ''dog'' would be preferred by the algorithms, if ''chihuahua'' does not rule out any distractors. The Incremental Algorithm is easy to implement and also computationally efficient running in

polynomial time In computer science, the time complexity is the computational complexity that describes the amount of computer time it takes to run an algorithm. Time complexity is commonly estimated by counting the number of elementary operations performed by ...

. The description generated by the IA can contain redundant properties that are superfluous because of later added properties. The creators do not consider this as a weakness, but rather as making the expressions less "psycholinguistically implausible". The following algorithm is a simplified version of Dale and Reiter's Incremental Algorithm by Krahmer and van Deemter that takes as input the referent ''r'', the ''D'' containing a collection of domain objects and a domain-specific ordered list ''Pref'' of preferred attributes. In the notation ''L'' is the description, ''C'' the context set of distractors and the function RulesOut() returns the set of objects which have a value different to V for attribute A_i. IncrementalAlgorithm (, D, Pref) L ← ∅ C ← D - for each A_i in list Pref do V = Value(r, A_i) if C ∩ RulesOut() ≠ ∅ then L ← L ∪ C ← C - RulesOut() endif if C = ∅ then return L endif return failure

Evaluation of REG systems

Before 2000 evaluation of REG systems has been of theoretical nature like the one done by Dale and Reiter. More recently, empirical studies have become popular which are mostly based on the assumption that the generated expressions should be similar to human-produced ones. Corpus-based evaluation began quite late in REG due to a lack of suitable data sets. Still corpus-based evaluation is the most dominant method at the moment though there is also evaluation by human judgement.

Corpus-based evaluation

First the distinction between

text corpora In linguistics, a corpus (plural ''corpora'') or text corpus is a language resource consisting of a large and structured set of texts (nowadays usually electronically stored and processed). In corpus linguistics, they are used to do statistical ...

and experimental corpora has to be made.

Text corpora In linguistics, a corpus (plural ''corpora'') or text corpus is a language resource consisting of a large and structured set of texts (nowadays usually electronically stored and processed). In corpus linguistics, they are used to do statistical ...

like the GNOME corpus can contain texts from all kind of domains. In REG they are used to evaluate the ''realization'' part of algorithms. The ''content selection'' part of REG on the other hand requires a corpus that contains the properties of all domain objects as well as the properties used in references. Typically those fully "semantically transparent" created in experiments using simple and controlled settings. These experimental corpora once again can be separated into ''General-Purpose Corpora'' that were collected for another purpose but have been analysed for referring expressions and ''Dedicated Corpora'' that focus specifically on referring expressions. Examples of General-Purpose Corpora are the Pear Stories, the Map Task corpus or the Coconut corpus while the Bishop corpus, the Drawer corpus and the TUNA corpusA Gatt, I van der Sluis, K van Deemter (2007). Evaluating algorithms for the generation of referring expressions using a balanced corpus. ''Proceedings of the 11th European Workshop on Natural Language Generation (ENLG)'', pages 49–56, Schloss Dagstuhl. count to the Dedicated Corpora. The TUNA corpus which contains web-collected data on the two domains furniture and people has been used in three shared REG challenges already.

Evaluation metrics

To measure the correspondence between corpora and the results of REG algorithms several Metrics have been developed. To measure the ''content selection'' part the

Dice coefficient Dice (singular die or dice) are small, throwable objects with marked sides that can rest in multiple positions. They are used for generating random values, commonly as part of tabletop games, including dice games, board games, role-playing ga ...

or the MASI (Measuring Agreement on Set-valued Items) metric are used. These measure the overlap of properties in two descriptions. In an evaluation the scores are usually averaged over references made by different human participants in the corpus. Also sometimes a measure called Perfect Recall Percentage (PRP) or Accuracy is used which calculates the percentage of perfect matches between an algorithm-produced and a human-produced reference. For the ''linguistic realization'' part of REG the overlap between strings has been measured using metrics like BLEU or

NIST The National Institute of Standards and Technology (NIST) is an agency of the United States Department of Commerce whose mission is to promote American innovation and industrial competitiveness. NIST's activities are organized into physical sci ...

. A problem that occurs with string-based metrics is that for example "The small monkey" is measured closer to "The small donkey" than to "The little monkey". A more time consuming way to evaluate REG algorithms is by letting humans judge the ''Adequacy'' (How clear is the description?) and ''Fluency'' (Is the description given in good and clear English?) of the generated expression. Also Belz and GattA Belz, A Gatt (2008). Intrinsic vs. extrinsic evaluation measures for referring expression generation. ''Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics (ACL)'', Columbus, OH. evaluated referring expressions using an experimental setup. The participants get a generated description and then have to click on the target. Here the extrinsic metrics reading time, identification time and error rate could be evaluated.

Notes

References

{{DEFAULTSORT:Referring Expression Generation Computational linguistics Natural language processing