Language identification in the limit is a formal model for inductive inference of formal languages, mainly by computers (see machine learning and induction of regular languages). It was introduced by E. Mark Gold in a technical report and a journal article with the same title.

In this model, a ''teacher'' provides to a ''learner'' some ''presentation'' (i.e. a sequence of strings) of some formal language. Learning is seen as an infinite process: each time the learner reads an element of the presentation, it should provide a ''representation'' (e.g. a formal grammar) for the language.

Gold defines a learner to ''identify in the limit'' a class of languages if, given any presentation of any language in the class, the learner produces only a finite number of wrong representations and then sticks with the correct representation. However, the learner need not be able to announce its correctness, and the teacher might present a counterexample to any representation arbitrarily long after it was proposed.

Gold defined two types of presentations:
* Text (positive information): an enumeration of all strings the language consists of.
* Complete presentation (positive and negative information): an enumeration of all possible strings, each with a label indicating whether the string belongs to the language or not.
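The two presentation types can be illustrated with a small sketch. The alphabet, the target language (strings with equally many a's and b's) and all function names below are assumptions chosen for illustration only; in Gold's model a text may list the strings in any order and with repetitions, so the enumeration shown is just one possible text.

```python
from itertools import count, islice, product

ALPHABET = "ab"  # assumed alphabet, for illustration only


def all_strings(alphabet=ALPHABET):
    """Yield every string over the alphabet, shortest first."""
    for n in count(0):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


def in_language(s):
    """Hypothetical target language: equally many a's and b's."""
    return s.count("a") == s.count("b")


def text_presentation():
    """Positive information only: enumerate the members of the language."""
    return (s for s in all_strings() if in_language(s))


def complete_presentation():
    """Positive and negative information: every string with a label."""
    return ((s, in_language(s)) for s in all_strings())


# Both presentations are infinite; only finite prefixes can be inspected.
first_text = list(islice(text_presentation(), 4))
first_complete = list(islice(complete_presentation(), 4))
```

Here `first_text` holds only members of the language, while `first_complete` pairs every string, member or not, with its label.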


Learnability

This model is an early attempt to formally capture the notion of learnability. Gold's journal article introduces for contrast the stronger models
* ''Finite identification'' (where the learner has to announce correctness after a finite number of steps), and
* ''Fixed-time identification'' (where correctness has to be reached after an a priori specified number of steps).
A weaker formal model of learnability is the ''probably approximately correct (PAC) learning'' model, introduced by Leslie Valiant in 1984.


Examples

It is instructive to look at concrete examples (in the tables) of learning sessions the definition of identification in the limit speaks about.

# A fictitious session to learn a regular language ''L'' over the alphabet {''a'', ''b''} from text presentation:
In each step, the teacher gives a string belonging to ''L'', and the learner answers with a guess for ''L'', encoded as a regular expression. Here "''A''+''B''" contains all strings that are in ''A'' or in ''B''; "''AB''" contains all concatenations of a string in ''A'' with a string in ''B''; "''A''*" contains all repetitions (zero or more times) of strings in ''A''; "ε" denotes the empty string; "a" and "b" denote themselves. For example, the expression "(''ab''+''ba'')*" in step 7 denotes the infinite set { ε, ''ab'', ''ba'', ''abab'', ''abba'', ''baab'', ''baba'', ... }. In step 3, the learner's guess is not consistent with the strings seen so far; in step 4, the teacher gives a string repeatedly. After step 6, the learner sticks to the regular expression (''ab''+''ba'')*. If this happens to be a description of the language ''L'' the teacher has in mind, it is said that the learner has learned that language.
If there existed a computer program for the learner's role that could successfully learn every regular language from text presentation, that class of languages would be ''identifiable in the limit''. Gold has shown that this is not the case.
# A particular learning algorithm always guessing ''L'' to be just the set of all strings seen so far:
If ''L'' is a finite language, the learner will eventually guess it correctly, but without being able to tell when. Although the guess did not change during steps 3 to 6, the learner could not be sure that it was correct.
Gold has shown that the class of finite languages is identifiable in the limit; however, this class is neither finitely nor fixed-time identifiable.
# Learning from complete presentation by telling:
In each step, the teacher gives a string and tells whether it belongs to ''L'' or not. Each possible string is eventually classified in this way by the teacher.
# Learning from complete presentation by request:
The learner gives a query string, and the teacher tells whether it belongs to ''L'' or not; the learner then gives a guess for ''L'', followed by the next query string. In this example, the learner happens to query in each step just the same string as given by the teacher in example 3.
In general, Gold has shown that each language class identifiable in the request-presentation setting is also identifiable in the telling-presentation setting, since the learner, instead of querying a string, just needs to wait until it is eventually given by the teacher.
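The learner of example 2 is simple enough to sketch in a few lines. The target language and the prefix of the text below are made up for illustration, and `union_learner` is a hypothetical name rather than one taken from Gold's article.

```python
def union_learner(presentation):
    """After each string, guess that L is exactly the set of strings seen so far."""
    seen = set()
    for s in presentation:
        seen.add(s)
        yield frozenset(seen)


target = {"a", "ab", "abb"}                # hypothetical finite language
text = ["a", "ab", "a", "abb", "ab", "a"]  # finite prefix of one possible text

guesses = list(union_learner(text))

# Which guesses coincide with the target language?
correct = [g == target for g in guesses]
```

In this run the guess becomes correct at the fourth step and never changes afterwards, but nothing in the strings seen so far tells the learner that no further string of ''L'' will appear. This matches the remark that finite languages are identifiable in the limit without being finitely identifiable.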


Learnability characterization

Dana Angluin characterized learnability from text (positive information) in a 1980 paper. If a learner is required to be effective, then an indexed class of recursive languages is learnable in the limit if there is an effective procedure that uniformly enumerates ''tell-tales'' for each language in the class (Condition 1; p. 121, top). It is not hard to see that if an ideal learner (i.e., an arbitrary function) is allowed, then an indexed class of languages is learnable in the limit if each language in the class has a tell-tale (Condition 2).
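The text above does not spell out what a tell-tale is. Assuming the usual definition from Angluin's paper (a finite subset ''T'' of a language ''L'' such that no language in the class contains ''T'' and is a proper subset of ''L''), the condition can be checked directly for a small class of finite languages. The class and the function name below are hypothetical, and the sketch says nothing about the effective enumeration required by Condition 1.

```python
def is_tell_tale(T, L, language_class):
    """Check the assumed tell-tale condition for finite languages given as sets."""
    T, L = set(T), set(L)
    if not T <= L:
        return False  # a tell-tale must be a subset of its language
    # No language in the class may sit strictly between T and L.
    return not any(T <= other < L for other in map(set, language_class))


# Hypothetical class of three finite languages.
language_class = [{"a"}, {"a", "b"}, {"a", "b", "c"}]

bad = is_tell_tale({"a"}, {"a", "b"}, language_class)   # {"a"} lies in between
good = is_tell_tale({"b"}, {"a", "b"}, language_class)  # nothing lies in between
```

Intuitively, once a learner has seen all of a tell-tale of ''L'', guessing ''L'' cannot be an overgeneralization relative to any other language of the class that is consistent with the data.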


Language classes learnable in the limit

The table shows which language classes are identifiable in the limit in which learning model. On the right-hand side, each language class is a superclass of all lower classes. Each learning model (i.e. type of presentation) can identify in the limit all classes below it. In particular, the class of finite languages is identifiable in the limit by text presentation (cf. Example 2 above), while the class of regular languages is not. ''Pattern languages'', introduced by Dana Angluin in another 1980 paper, are also identifiable by normal text presentation; they are omitted in the table, since they are above the singleton and below the primitive recursive language class, but incomparable to the classes in between (they are incomparable to the regular and to the context-free language class: Theorem 3.10, p. 53).


Sufficient conditions for learnability

Condition 1 in Angluin's paper is not always easy to verify. Therefore, various sufficient conditions for the learnability of a language class have been proposed. See also ''Induction of regular languages'' for learnable subclasses of regular languages.


Finite thickness

A class of languages has finite thickness if every non-empty set of strings is contained in at most finitely many languages of the class. This is exactly Condition 3 in Angluin's paper. Angluin showed that if a class of recursive languages has finite thickness, then it is learnable in the limit (p. 123, bottom, Corollary 2). A class with finite thickness certainly satisfies the MEF-condition and the MFF-condition; in other words, finite thickness implies M-finite thickness (proof of Corollary 29).


Finite elasticity

A class of languages is said to have finite elasticity if for every infinite sequence of strings $s_0, s_1, \ldots$ and every infinite sequence of languages $L_1, L_2, \ldots$ in the class, there exists a finite number $n$ such that $s_n \notin L_n$ implies that $L_n$ is inconsistent with $\{s_0, \ldots, s_{n-1}\}$. It has been shown that a class of recursively enumerable languages is learnable in the limit if it has finite elasticity.
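The definition is often easier to read in its negated form. As a sketch, assuming that "inconsistent with" means "does not contain all of", and writing $\mathcal{L}$ for the class (a symbol introduced here for illustration), a class has infinite elasticity exactly when the following holds:

```latex
\exists\, (s_n)_{n \ge 0},\ \exists\, (L_n)_{n \ge 1} \text{ with } L_n \in \mathcal{L}
\ \text{ such that }\ \forall n \ge 1:\quad
\{s_0, \ldots, s_{n-1}\} \subseteq L_n
\ \text{ and }\ s_n \notin L_n .
```

Finite elasticity is the statement that no such pair of sequences exists.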


Mind change bound

A mind change bound is a bound on the number of hypothesis changes that occur before convergence.
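Counting hypothesis changes in a recorded session is straightforward. The sketch below compares successive guesses syntactically, which is an assumption: two different representations of the same language would be counted as a change. The sample guesses are made up for illustration.

```python
def count_mind_changes(guesses):
    """Count the positions where a guess differs from the preceding guess."""
    return sum(1 for prev, cur in zip(guesses, guesses[1:]) if prev != cur)


# Hypothetical sequence of guesses, written as regular expressions.
session = ["a", "a+b", "a+b", "(ab+ba)*", "(ab+ba)*"]
changes = count_mind_changes(session)
```

A class has a mind change bound when some fixed number limits this count for every language in the class and every presentation of it.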


Other concepts


Infinite cross property

A language $L$ has the infinite cross property within a class of languages $\mathcal{L}$ if there is an infinite sequence $L_i$ of distinct languages in $\mathcal{L}$ and a sequence of finite subsets $T_i$ such that:
* $T_1 \subset T_2 \subset \ldots$,
* $T_i \subseteq L_i$,
* $T_{i+1} \not\subseteq L_i$, and
* $\lim_{i \to \infty} T_i = L$.
Note that $L$ is not necessarily a member of the class of languages. It is not hard to see that if there is a language with the infinite cross property within a class of languages, then that class of languages has infinite elasticity.


Relations between concepts

* Finite thickness implies finite elasticity; the converse is not true.
* Finite elasticity and conservative learnability imply the existence of a mind change bound.
* Finite elasticity and M-finite thickness imply the existence of a mind change bound. However, M-finite thickness alone does not imply the existence of a mind change bound; neither does the existence of a mind change bound imply M-finite thickness.
* Existence of a mind change bound implies learnability; the converse is not true.
* If we allow for noncomputable learners, then finite elasticity implies the existence of a mind change bound; the converse is not true.
* If there is no accumulation order for a class of languages, then there is a language (not necessarily in the class) that has the infinite cross property within the class, which in turn implies infinite elasticity of the class.


Open questions

*If a countable class of recursive languages has a mind change bound for noncomputable learners, does the class also have a mind change bound for computable learners, or is the class unlearnable by a computable learner?

