Artificial grammar learning (AGL) is a paradigm of study within cognitive psychology and linguistics. Its goal is to investigate the processes that underlie human language learning by testing subjects' ability to learn a made-up grammar in a laboratory setting. It was developed to evaluate the processes of human language learning but has also been used to study implicit learning in a more general sense. The area of interest is typically the subjects' ability to detect patterns and statistical regularities during a training phase and then use their new knowledge of those patterns in a testing phase. The testing phase can either use the symbols or sounds from the training phase or transfer the patterns to another set of symbols or sounds as surface structure.

Many researchers propose that the rules of the artificial grammar are learned on an implicit level, since the rules are never explicitly presented to the participants. The paradigm has also recently been used in other areas of research, such as language learning aptitude and structural priming, and to investigate which brain structures are involved in syntax acquisition and implicit learning. Apart from humans, the paradigm has also been used to investigate pattern learning in other species, e.g. cottontop tamarins and starlings.

History

More than half a century ago, George A. Miller established the paradigm of artificial grammar learning in order to investigate the influence of explicit grammar structures on human learning. He designed a grammar model of letters with different sequences. His research demonstrated that it was easier to remember a structured grammar sequence than a random sequence of letters. His explanation was that learners could identify the common characteristics between learned sequences and accordingly encode them to a memory set. He predicted that subjects could identify which letters would most likely appear together repeatedly as a sequence and which would not, and that the subjects would use this information to form memory sets. Those memory sets later served participants as a strategy during their memory tests.

Reber doubted Miller's explanation. He claimed that if participants could encode the grammar rules as productive memory sets, then they should be able to verbalize their strategy in detail. He conducted research that led to the development of the modern AGL paradigm. This research used a synthetic grammar learning model to test implicit learning. AGL became the most used and tested model in the field. As in the original paradigm developed by Miller, participants were asked to memorize a list of letter strings created from an artificial grammar rule model. Only during the test phase were participants told that there was a set of rules behind the letter sequences they had memorized. They were then instructed to categorize new letter strings, which they had not seen before, according to the same set of rules. They classified the new strings as "grammatical" (constructed from the grammar rule) or "randomly constructed". If subjects sorted the new strings correctly above chance level, it could be inferred that they had acquired the grammatical rule structure without any explicit instruction in the rules.

Reber found that participants sorted new strings above chance level. While they reported using strategies during the sorting task, they could not actually verbalize those strategies. Subjects could identify which strings were grammatically correct but could not identify the rules that composed grammatical strings. This research was replicated and expanded upon by many others. The conclusions of most of these studies were congruent with Reber's hypothesis: the implicit learning process took place without intentional learning strategies. These studies also identified common characteristics of the implicitly acquired knowledge:
# Abstract representation for the set of rules.
# Unconscious strategies that can be tested with performance.

The modern paradigm

The modern AGL paradigm can be used to investigate explicit and implicit learning, although it is most often used to test implicit learning. In a typical AGL experiment, participants are required to memorize strings of letters previously generated by a specific grammar. The strings are usually 2 to 9 letters long. An example of such a grammar is shown in figure 1.

Figure 1: Example of an artificial grammar rule
* Ruleful strings: VXVS, TPTXVS
* Unruleful strings: VXXXS, TPTPS

To compose a grammatically "ruleful" string of letters according to the predetermined grammar rule, a subject must follow the rules for the pairing of letters as represented in the model (figure 1). A string that violates the grammatical rule system is considered "unruleful" or randomly constructed.

In a standard AGL implicit learning task, subjects are not told that the strings are based on a specific grammar. Instead, they are simply asked to memorize the letter strings for a memory test. After the learning phase, subjects are told that the letter strings presented during the learning phase were based on specific rules, but they are not told what the rules are. During a test phase, the subjects are instructed to categorize new letter strings as "ruleful" or "unruleful". The dependent variable usually measured is the percentage of correctly categorized strings. Implicit learning is considered successful when the percentage of correctly sorted strings is significantly higher than chance level. Such a significant difference indicates a learning process that goes beyond memorizing the presented letter strings.
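The test-phase logic described above can be sketched in a few lines of Python. The transition table below is a hypothetical finite-state grammar invented for illustration; it is not the grammar of figure 1, although it happens to accept the two ruleful example strings and reject the two unruleful ones. The function names and the one-sided exact binomial test against a 50% chance level are likewise assumptions, offered as one plausible way to score "above chance" performance.

```python
from math import comb

# Hypothetical finite-state grammar (NOT the grammar shown in figure 1).
# Keys are (state, letter); values are the next state.
TRANSITIONS = {
    (0, "T"): 1, (0, "V"): 2,
    (1, "P"): 1, (1, "T"): 2,
    (2, "X"): 3,
    (3, "V"): 4,
    (4, "P"): 2, (4, "S"): 5,
}
ACCEPTING = {5}


def is_ruleful(string):
    """True if every letter follows a permitted transition and the
    string ends in an accepting state."""
    state = 0
    for letter in string:
        state = TRANSITIONS.get((state, letter))
        if state is None:  # no permitted transition: rule violation
            return False
    return state in ACCEPTING


def proportion_correct(responses):
    """responses: list of (string, judged_ruleful) pairs from one subject."""
    hits = sum(is_ruleful(s) == judged for s, judged in responses)
    return hits / len(responses)


def p_above_chance(n_correct, n_trials, chance=0.5):
    """One-sided exact binomial p-value: P(X >= n_correct) under guessing."""
    return sum(
        comb(n_trials, k) * chance**k * (1 - chance) ** (n_trials - k)
        for k in range(n_correct, n_trials + 1)
    )
```

For instance, a subject who classifies 14 of 20 test strings correctly gets `p_above_chance(14, 20)` of about 0.058, which would not count as significantly above chance at the conventional 0.05 level.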

Bayesian learning

The mechanism hypothesized to underlie the implicit learning that occurs in artificial grammar learning is statistical learning or, more specifically, Bayesian learning. Bayesian learning takes into account the biases, or "prior probability distributions", that individuals bring to implicit learning tasks. A prior can be thought of as a probability distribution that assigns to each possible hypothesis the probability that it is correct. Due to the structure of the Bayesian model, the inferences output by the model take the form of a probability distribution rather than a single most probable event. This output distribution is a "posterior probability distribution". The posterior probability of each hypothesis is the probability that the hypothesis is true given the data; by Bayes' theorem, it is proportional to the prior probability of the hypothesis multiplied by the probability of the data given that the hypothesis is true.

This Bayesian model of learning is fundamental for understanding the pattern detection process involved in implicit learning and, therefore, the mechanisms that underlie the acquisition of artificial grammar rules. It is hypothesized that the implicit learning of grammar involves predicting co-occurrences of certain words in a certain order. For example, "the dog chased the ball" is a sentence that can be learned as grammatically correct on an implicit level because "chased" frequently occurs as one of the words that follow "dog". A sentence like "the dog cat the ball" is implicitly recognized as grammatically incorrect because utterances rarely contain those words paired in that order. This process is important for teasing apart thematic roles and parts of speech in grammatical processing. While the labeling of the thematic roles and parts of speech is explicit, the identification of words and parts of speech is implicit.
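As a rough illustration of the prior-to-posterior update described above, the following Python sketch applies Bayes' theorem to a discrete set of hypotheses. The two hypothesis names and all probability values are invented for illustration and do not come from any AGL study.

```python
def posterior(priors, likelihoods):
    """Bayes' theorem over a discrete set of hypotheses.

    priors:      {hypothesis: P(h)}
    likelihoods: {hypothesis: P(data | h)}
    returns:     {hypothesis: P(h | data)}
    """
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    evidence = sum(unnormalized.values())  # P(data), the normalizing constant
    return {h: value / evidence for h, value in unnormalized.items()}


# Hypothetical learner with two candidate rules and no initial preference.
priors = {"verb_follows_noun": 0.5, "noun_follows_noun": 0.5}
# Assumed probability of hearing "the dog chased the ball" under each rule.
likelihoods = {"verb_follows_noun": 0.08, "noun_follows_noun": 0.02}

beliefs = posterior(priors, likelihoods)
# Using the posterior as the next prior models a second, similar exposure.
beliefs_after_two = posterior(beliefs, likelihoods)
```

The output is a full distribution over hypotheses rather than a single winner, and each further exposure to a consistent sentence shifts more probability toward the rule that predicts it.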

Explanatory models

Traditional approaches to AGL claim that the stored knowledge obtained during the learning phase is abstract. Other approaches argue that this stored knowledge is concrete and consists of exemplars of strings encountered during the learning phase or "chunks" of these exemplars. In any case, it is assumed that the information stored in memory is retrieved in the test phase and is used to aid decisions about letter strings. Three main approaches attempt to explain the AGL phenomena:
# ''Abstract approach'': According to this traditional approach, participants acquire an abstract representation of the artificial grammar rule in the learning stage. That abstract structure helps them decide whether a new string presented during the test phase is grammatical or randomly constructed.
# ''Concrete knowledge approach'': This approach proposes that during the learning stage participants learn specific examples of strings and store them in memory. During the testing stage, participants do not sort the new strings according to an abstract rule; instead they sort them according to their similarity to the examples stored in memory from the learning stage. Opinions differ on how concrete the learned knowledge really is. Brooks & Vokey argue that all of the knowledge stored in memory is represented as concrete examples of the full strings studied during the learning stage, and that strings are sorted during the testing stage according to these full representations. Perruchet & Pacteau, on the other hand, claimed that knowledge of the strings from the learning stage is stored in the form of "memory chunks", where 2 to 3 letters are learned as a sequence along with knowledge about their permitted location in the full string.
# ''Dual factor approach'': The dual-process learning model combines the two approaches described above. It proposes that a person will rely on concrete knowledge when they can. When they cannot rely on concrete knowledge (for example, on a transfer of learning task), the person will use abstract knowledge of the rules.

Research with amnesia patients suggests that the dual factor approach may be the most accurate model. A series of experiments with amnesiac patients supports the idea that AGL involves both abstract concepts and concrete exemplars. Amnesiacs were able to classify stimuli as "grammatical" or "randomly constructed" just as well as participants in the control group. Although they completed the task successfully, amnesiacs were not able to explicitly recall grammatical "chunks" of the letter sequence, whereas the control group was. When performing the task with the same grammar rules but a different set of letters from the one used in training, both amnesiacs and the control group were able to complete the task (although performance was better when the task used the same set of letters as in training). The results support the dual factor approach to artificial grammar learning in that people use abstract information to learn rules for grammars and use concrete, exemplar-specific memory for chunks. Since the amnesiacs were unable to store specific "chunks" in memory, they completed the task using an abstract set of rules. The control group was able to store these specific chunks in memory and, as evidenced by recall, did store them for later reference.
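The chunk-based account can be made concrete with a small Python sketch that scores a test string by how often its 2- and 3-letter chunks appeared in training. This is a simplified illustration, not a model taken from the studies above: the function names are hypothetical, and the score ignores the positional knowledge that Perruchet & Pacteau's account also includes.

```python
from collections import Counter


def chunks(string, sizes=(2, 3)):
    """All contiguous letter chunks of the given sizes."""
    return [string[i:i + n] for n in sizes for i in range(len(string) - n + 1)]


def chunk_counts(training_strings):
    """How often each chunk occurred across the memorized strings."""
    counts = Counter()
    for s in training_strings:
        counts.update(chunks(s))
    return counts


def chunk_familiarity(test_string, counts):
    """Mean training frequency of the chunks in a test string."""
    parts = chunks(test_string)
    if not parts:  # string too short to contain any chunk
        return 0.0
    return sum(counts[c] for c in parts) / len(parts)


# Training on the two ruleful example strings from figure 1.
counts = chunk_counts(["VXVS", "TPTXVS"])
novel_score = chunk_familiarity("TXVS", counts)       # shares many chunks
violation_score = chunk_familiarity("VXXXS", counts)  # shares only "VX"
```

A learner relying on such a score alone could classify many strings correctly without representing any abstract rule. The same score fails on a transfer task with a new letter set, where no training chunk recurs, which is where the dual factor approach assumes abstract knowledge takes over.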

Automaticity debate

AGL research has been criticized due to the "automatic question": Is AGL considered to be an automatic process? During memory encoding, performance can be automatic in the sense of occurring without conscious monitoring (without conscious guidance by the performer's intentions). In the case of AGL, it was claimed that implicit learning is an automatic process because it takes place with no intention of learning a specific grammar rule. This complies with the classic definition of an "automatic process" as a fast, unconscious, effortless process that may start unintentionally. Once triggered, it continues until it is over, without the ability to stop it or ignore its consequences. This definition has been challenged many times, and alternative definitions of automatic process have been given. Reber's presumption that AGL is automatic could be problematic because it implies that an unintentional process is in its essence an automatic process.

When focusing on AGL tests, a few issues need to be addressed. The process is complex and contains both encoding and recall or retrieval. Both encoding and retrieval could be interpreted as automatic processes, since what was encoded during the learning stage is not necessary for the task intentionally performed during the test stage. Researchers need to differentiate between implicitness as a property of the learning process (knowledge encoding) and implicitness as a property of performance during the test phase (knowledge retrieval). Knowledge encoded during training may include many aspects of the presented stimuli (whole strings, relations among elements, etc.). The contribution of the various components to performance depends on both the specific instructions in the acquisition phase and the requirements of the retrieval task. Therefore, the instructions in every phase are important in determining whether each stage will require automatic processing, and each phase should be evaluated for automaticity separately.

One hypothesis that contradicts the automaticity of AGL is the "mere exposure effect". The mere exposure effect is increased affect towards a stimulus that results from nonreinforced, repeated exposure to the stimulus. Results from over 200 experiments on this effect indicate a positive relationship between mean "goodness" rating and frequency of stimulus exposure. Stimuli for these experiments included line drawings, polygons and nonsense words (which are types of stimuli used in AGL research). These experiments exposed participants to each stimulus up to 25 times. Following each exposure, participants were asked to rate the degree to which each stimulus suggested "good" vs. "bad" affect on a 7-point scale. In addition to the main pattern of results, several experiments also found that participants rated previously exposed items higher in positive affect than novel items. Since implicit cognition should not reference previous study episodes, the effects on affect ratings should not have been observed if processing of these stimuli were truly implicit. The results of these experiments suggest that different categorization of the strings may occur due to differences in affect associated with the strings and not due to implicitly learned grammar rules.

Artificial intelligence

Since the advent of computers and artificial intelligence, computer programs have been adapted that attempt to simulate the implicit learning process observed in the AGL paradigm. The AI programs first adapted to simulate both natural and artificial grammar learning used the following basic structure:
;Given: A set of grammatical sentences from some language.
;Find: A procedure for recognizing and/or generating all grammatical sentences in that language.

An early model for AI grammar learning is Wolff's SNPR System. The program acquires a series of letters with no pauses or punctuation between words and sentences. It then examines the string in subsets, looks for common sequences of symbols and defines "chunks" in terms of these sequences (these chunks are akin to the exemplar-specific chunks described for AGL). As the model acquires these chunks through exposure, the chunks begin to replace the sequences of unbroken letters. When a chunk precedes or follows a common chunk, the model determines disjunctive classes in terms of the first set. For example, when the model encounters "the-dog-chased" and "the-cat-chased", it classifies "dog" and "cat" as members of the same class since they both precede "chased". While the model sorts chunks into classes, it does explicitly define these groups (e.g., noun, verb). Early AI models of grammar learning such as these ignored the effect of negative instances of grammar on grammar acquisition and also lacked the ability to connect grammatical rules to pragmatics and semantics.

Newer models have attempted to factor these details in. The Unified Model attempts to take both of these factors into account. The model breaks down grammar according to "cues". Languages mark case roles using five possible cue types: word order, case marking, agreement, intonation and verb-based expectation. The influence that each cue has over a language's grammar is determined by its "cue strength" and "cue validity". Both of these values are determined using the same formula, except that cue strength is defined through experimental results and cue validity is defined through corpus counts from language databases. The formula for cue strength/validity is as follows:
: Cue strength/cue validity = cue availability × cue reliability
Cue availability is the proportion of times that the cue is available over the times that it is needed. Cue reliability is the proportion of times that the cue is correct over the total occurrences of the cue. By incorporating cue reliability along with cue availability, the Unified Model is able to account for the effects of negative instances of grammar, since it takes accuracy and not just frequency into account. As a result, it also accounts for semantic and pragmatic information, since cues that do not produce grammar in the appropriate context will have low cue strength and cue validity. While MacWhinney's model also simulates natural grammar learning, it attempts to model the implicit learning processes observed in the AGL paradigm.
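The cue formula above can be worked through with a short Python sketch. The function name, argument names and counts are hypothetical and chosen only to show the arithmetic; real cue validities would come from corpus counts, and cue strengths from experimental results.

```python
def cue_validity(times_available, times_needed, times_correct):
    """Cue validity (or strength) = cue availability * cue reliability.

    availability: times the cue is available / times it is needed
    reliability:  times the cue is correct / total occurrences of the cue
    """
    if times_needed == 0 or times_available == 0:
        return 0.0  # a cue that is never needed or never occurs carries no weight
    availability = times_available / times_needed
    reliability = times_correct / times_available
    return availability * reliability


# Hypothetical counts over 100 sentences in which a case role must be assigned.
word_order = cue_validity(times_available=90, times_needed=100, times_correct=81)
case_marking = cue_validity(times_available=40, times_needed=100, times_correct=40)
```

With these made-up counts, word order scores 0.9 × 0.9 = 0.81 and case marking scores 0.4 × 1.0 = 0.4: a perfectly reliable cue can still be weak if it is rarely available. Algebraically the product reduces to the number of correct uses divided by the number of times the cue is needed, which is why frequent but misleading cues are penalized.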

Cognitive neuroscience and the AGL paradigm

Contemporary studies with AGL have attempted to identify which brain structures are involved in the acquisition of grammar and implicit learning. Agrammatic aphasic patients (see agrammatism) were tested with the AGL paradigm. The results show that the breakdown of language in agrammatic aphasia is associated with an impairment in artificial grammar learning, indicating damage to domain-general neural mechanisms subserving both language and sequential learning. De Vries, Barth, Maiworm, Knecht, Zwitserlood & Flöel found that electrical stimulation of Broca's area enhances implicit learning of an artificial grammar. Direct current stimulation may facilitate acquisition of grammatical knowledge, a finding of potential interest for rehabilitation of aphasia. Petersson, Vasiliki & Hagoort examine the neurobiological correlates of syntax, the processing of structured sequences, by comparing fMRI results on artificial and natural language syntax. On the basis of AGL testing, they argue that the "Chomsky hierarchy" is not directly relevant for neurobiological systems.

References

Petersson, K.M.; Vasiliki, F.; Hagoort, P. (2010). "What artificial grammar learning reveals about the neurobiology of syntax". Brain & Language: 340–353. http://pubman.mpdl.mpg.de/pubman/item/escidoc%3A101939%3A11/component/escidoc%3A532052/Petersson_What_Artifical_grammar_Brain_Lang_Corrected%20_Proof.pdf