List of unsolved problems in linguistics

This article discusses notable unsolved problems in linguistics. Some of the issues below are commonly recognized as unsolved problems; i.e., it is generally agreed that no solution is known. Others may be described as controversies; i.e., although there is no common agreement about the answer, there are established schools of thought that believe they have a correct answer.


Concepts

* Is there a universal definition of ''word''?
* Is there a universal definition of ''syllable''?
* Is there a universal definition of ''sentence''?
* Are there any universal grammatical categories?
* Is syntactic structure constructed of part-whole relations of syntactic constituents or is it built of an asymmetrical dependency relation between words?
* Can the elements contained in words (morphemes) and the elements contained in sentences (words or syntactic constituents) be shown to follow the same principles of combination?
* How are domains for phonological processes related to syntactic structure? Do prosodic domains deviate from syntactic constituent structure?
* Is it possible to formally delineate languages from each other? That is to say, is it possible to use linguistic (rather than social) criteria to draw a clear boundary between two closely related languages with a dialect continuum between their respective standard forms (e.g. Occitan and Catalan)?
* How does grammaticalization function?
* What constitutes grammatical language, as viewed by native speakers of that particular language, i.e. the problem of gradient well-formedness?
* Is there one universal process with which the evolution of creole languages can be tracked?
* How does lexical substitution function given the potentially limitless number of different contexts, the limits of one's knowledge and the limits of one's understanding and usage of language?
* How do idiolects and dialects emerge? Are there any common patterns in their development? Can they be quantitatively and qualitatively measured at all and if so, how?


Philosophy of language

* What is language?
** Is the basic structure of language an innate Universal Grammar or is it a socially learned behavior structured by the functions to which language is put in human interaction?
* How do intension, comprehension, reference, intention and intentionality, extension, linguistic relativity, context, ambiguity, polysemy, idiolect and dialect, among other major linguistic concepts, interact to give rise to meaningful language as spoken or written by an individual?


Historical linguistics and the evolution of language


The evolution of language

* How and when did language originate?
** How and when did different modes of language (spoken, signed, written) originate?
** Were ''Homo sapiens'' the first humans to use language? What about other species in the genus ''Homo''?
** Is language continuous or discontinuous with earlier forms of communication? Did language appear suddenly or gradually?


Language classification

* What language families are valid?
** Are any macro-families valid?
* Can any of the approximately 100 unclassified languages be classified? Or does our limited knowledge of them prevent that?
* Can we decipher any of the extant undeciphered writing systems?
* Language isolates have no demonstrated relatives, and essentially form language families on their own. Can any of the approximately 159 language isolates be shown to be related to other languages?
* Can we use the comparative method to reconstruct back to an arbitrary time depth, or do we need new methods to reconstruct the distant past of languages? Is there a time depth beyond which we cannot reconstruct?
* Can we ever demonstrate that all languages are ultimately related to each other, or that they aren't?


Psycholinguistics

* Language emergence:
** Emergence of grammar
* Language acquisition:
** Controversy: infant language acquisition/first-language acquisition. How are infants able to learn language? One line of debate is between two points of view: that of psychological nativism, i.e., the language ability is somehow "hardwired" in the human brain, and that of usage-based theories of language, according to which language emerges through the brain's interaction with the environment and is activated by general dispositions for social interaction and communication, abstract symbolic thought, and pattern recognition and inference.
** Is the human ability to use syntax based on innate mental structures or is syntactic speech the function of intelligence and interaction with other humans? The question is closely related to those of language emergence and acquisition.
** Is there a language acquisition device? How localized is language in the brain? Is there a particular area in the brain responsible for the development of language abilities or is it only partially localized?
** What fundamental reasons explain why ultimate attainment in second-language acquisition is typically some way short of the native speaker's ability, with learners varying widely in performance?
** What are the optimal ways to achieve successful second-language acquisition?
** Animals and language: How much human language can animals be taught to use? How much of animal communication can be said to have the same properties as human language (e.g. compositionality of bird calls as syntax)?
** What role does linguistic intuition play, how is it formed and how does it function? Is it closely linked to exposure to a unique set of different experiences and their contexts throughout one's personal life?
* Linguistic relativity: What are the relations between grammatical patterns and cognitive habits of speakers of different languages? Does language use train or habituate speakers to certain cognitive habits that differ between speakers of different languages? Are effects of linguistic relativity caused by grammatical structures or by cultural differences that underlie differences in language use?


Sociolinguistics

* How should variation in language (including idiolects, dialects, sociolects, jargons, argots, etc.) be dealt with to achieve effective and successful communication between individuals and between groups? That is, what are the best ways to ensure efficient communication without misunderstandings, both in everyday life and in educational, scientific and philosophical discussions?
* What are the best ways to quantitatively and qualitatively compare language use between individuals and between groups?
* How do time (and the semantic change that it brings) and physical age influence language use?
* What causes linguistic features to begin to undergo language change at some points in time and in some dialects but not others? (This is known as the "actuation problem".)


Computational linguistics

* Is perfect computational word-sense disambiguation attainable by using software? If yes, how and why? If not, why not? (This presupposes the solution to the unsolved problems in the other areas of linguistics as a basis.)
* Is accurate computational word-sense induction feasible? If yes, how and why? If not, why not?
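
To make the disambiguation question concrete, here is a minimal sketch of a gloss-overlap heuristic in the spirit of the simplified Lesk algorithm. It is an illustration of the task, not a solution to the open problem. The sense inventory `SENSES`, its glosses, the stop-word list and the function name `disambiguate` are all hypothetical and chosen only for this example.

```python
import re

# Hypothetical toy sense inventory: sense label -> gloss (illustrative only).
SENSES = {
    "bank": {
        "financial": "an institution that accepts deposits and lends money",
        "river": "the sloping land beside a river or stream of water",
    }
}

# Small illustrative stop-word list (an assumption, not a standard resource).
STOP_WORDS = {"a", "an", "and", "of", "or", "that", "the", "to", "by", "we", "i"}


def tokens(text):
    """Lowercase the text and return its set of non-stop-word tokens."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOP_WORDS


def disambiguate(word, context):
    """Pick the sense whose gloss shares the most tokens with the context.

    Ties (including zero overlap for every sense) resolve to the first
    sense listed in the inventory.
    """
    context_tokens = tokens(context)
    best_sense, best_overlap = None, -1
    for sense, gloss in SENSES[word].items():
        overlap = len(tokens(gloss) & context_tokens)
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense
```

A context that shares no words with any gloss simply falls back to the first listed sense. This shows how far such surface heuristics are from the "perfect" disambiguation the question asks about.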


Lexicology and lexicography

* What makes a good dictionary?
* To what extent are dictionaries reliable in terms of their supposed universality when spoken language is constantly changing (semantic change, semantic extension, semantic compression, etc.)?
* What are good practices to avoid circular definitions in dictionaries? Is it possible to eliminate them at all, given the vagueness, polysemy, etc. in all languages?
* What are the best ways to ensure efficient communication without misunderstandings, both in everyday life and in educational, scientific and philosophical discussions? Is total terminology standardization attainable at all? If yes, does it involve the mass use of freely available and easily accessible terminology databases?
* To what extent are termbases reliable and can their reliability be measured objectively? If yes, how and why? If not, why not? What is the relationship between termbases and individual subjectivity, and can subjectivity about word sense disambiguation be overcome at all or is it a natural result of different experiences in one's unique personal life?
* How can lexicographic errors and lexicographic information costs be successfully reduced?


Translation

* Is there an objective gauge for the quality of translation? (Robert Spence, "A Functional Approach to Translation Studies. New systemic linguistic challenges in empirically informed didactics", thesis, 2004.)
* What are the best strategies for quality translation: fidelity or transparency, dynamic or formal equivalence, etc.?
* What are the best ways to deal with untranslatability, e.g. lexical gaps?
* How can translation loss and its accumulation best be dealt with, e.g. when translating from a translation (see Chinese whispers)?
* Can machine translation ever achieve the high degree of comprehensibility and quality of translations produced by a good professional human translator?
* What are the effective ways to achieve proper localization and internationalization?


Other

* Is there an objective way to determine which are the most difficult languages?
* To what extent are conlangs usable and useful when used as natural languages by humans?

