Textual Corpus
In linguistics, a corpus (plural corpora) or text corpus is a language resource consisting of a large and structured set of texts (nowadays usually electronically stored and processed). In corpus linguistics, corpora are used for statistical analysis and hypothesis testing, checking occurrences or validating linguistic rules within a specific language territory. In search technology, a corpus is the collection of documents being searched.

Overview
A corpus may contain texts in a single language (monolingual corpus) or text data in multiple languages (multilingual corpus). To make corpora more useful for linguistic research, they are often subjected to a process known as annotation. An example of annotating a corpus is part-of-speech tagging, or POS-tagging, in which information about each word's part of speech (verb, noun, adjective, etc.) is added to the corpus in the form of tags. Another example is indicating the lemma (base ...
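As a rough illustration of what an annotated corpus can look like in practice, the following Python sketch stores each token together with a part-of-speech tag and a lemma, then counts them. The two-sentence corpus, the tag names (`DET`, `NOUN`, `VERB`) and the function names are assumptions made for this example; real corpora use their own tag sets and storage formats.

```python
from collections import Counter

# Hypothetical hand-annotated toy corpus.
# Each sentence is a list of (word, POS tag, lemma) triples.
CORPUS = [
    [("The", "DET", "the"), ("cats", "NOUN", "cat"), ("sleep", "VERB", "sleep")],
    [("A", "DET", "a"), ("cat", "NOUN", "cat"), ("slept", "VERB", "sleep")],
]


def tag_counts(corpus):
    """Count how often each part-of-speech tag occurs in the corpus."""
    return Counter(tag for sentence in corpus for _, tag, _ in sentence)


def lemma_counts(corpus, tag):
    """Count lemmas of all tokens carrying the given part-of-speech tag."""
    return Counter(
        lemma for sentence in corpus for _, t, lemma in sentence if t == tag
    )


if __name__ == "__main__":
    print(tag_counts(CORPUS))
    print(lemma_counts(CORPUS, "VERB"))
```

Because the lemma is stored alongside each word, the inflected forms "sleep" and "slept" are counted as the same verb, which is the kind of query that annotation is meant to make possible.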


Linguistics
Linguistics is the scientific study of human language. It is called a scientific study because it entails a comprehensive, systematic, objective, and precise analysis of all aspects of language, particularly its nature and structure. Linguistics is concerned with both the cognitive and social aspects of language. It is considered a scientific field as well as an academic discipline; it has been classified as a social science, natural science, cognitive science (Thagard, Paul, "Cognitive Science", The Stanford Encyclopedia of Philosophy (Fall 2008 Edition), Edward N. Zalta (ed.)), or part of the humanities. Traditional areas of linguistic analysis correspond to phenomena found in human linguistic systems, such as syntax (rules governing the structure of sentences); semantics (meaning); morphology (structure of words); phonetics (speech sounds and equivalent gestures in sign languages); phonology (the abstract sound system of a particular language); and pragmatics (how social con ...


Speech Recognition
Speech recognition is an interdisciplinary subfield of computer science and computational linguistics that develops methodologies and technologies that enable computers to recognize spoken language and translate it into text, with the main benefit of searchability. It is also known as automatic speech recognition (ASR), computer speech recognition or speech to text (STT). It incorporates knowledge and research in the computer science, linguistics and computer engineering fields. The reverse process is speech synthesis. Some speech recognition systems require "training" (also called "enrollment") where an individual speaker reads text or isolated vocabulary into the system. The system analyzes the person's specific voice and uses it to fine-tune the recognition of that person's speech, resulting in increased accuracy. Systems that do not use training are called "speaker-independent" systems. Systems that use training are called "speaker-dependent". Speech recognition ...


1350 BC
Events and trends
* c. 1356 BC – Amenhotep IV begins the worship of Aten in Ancient Egypt, changing his name to Akhenaten and moving the capital to Akhetaten, starting the Amarna Period.
* c. 1352 BC – Amenhotep III (Eighteenth Dynasty of Egypt) dies and is succeeded as Pharaoh by Amenhotep IV.
* 1350 BC – Yin becomes the new capital of Shang dynasty China.


Amarna Letters
The Amarna letters (sometimes referred to as the Amarna correspondence or Amarna tablets, and cited with the abbreviation EA, for "El Amarna") are an archive, written on clay tablets, primarily consisting of diplomatic correspondence between the Egyptian administration and its representatives in Canaan and Amurru, or neighboring kingdom leaders, during the New Kingdom, spanning a period of no more than thirty years between c. 1360–1332 BC (see Amarna letters, Chronology, for dates; Moran, p. xxxiv). The letters were found in Upper Egypt at el-Amarna, the modern name for the ancient Egyptian capital of Akhetaten, founded by pharaoh Akhenaten (1350s–1330s BC) during the Eighteenth Dynasty of Egypt. The Amarna letters are unusual in Egyptological research, because they are written not in the language of ancient Egypt, but in cuneiform, the writing system of ancient Mesopotamia. Most are in a variety of Akkadian sometim ...


Biblical Scholarship
Biblical criticism is the use of critical analysis to understand and explain the Bible. During the eighteenth century, when it began as historical-biblical criticism, it was based on two distinguishing characteristics: (1) the concern to avoid dogma and bias by applying a neutral, non-sectarian, reason-based judgment to the study of the Bible, and (2) the belief that the reconstruction of the historical events behind the texts, as well as the history of how the texts themselves developed, would lead to a correct understanding of the Bible. This sets it apart from earlier, pre-critical methods; from the anti-critical methods of those who oppose criticism-based study; from later post-critical orientation; and from the many different types of criticism which biblical criticism transformed into in the late twentieth and early twenty-first centuries. Most scholars believe the German Enlightenment led to the creation of biblical criticism, although some assert that its roots ...


Decipherment
In philology, decipherment is the discovery of the meaning of texts written in ancient or obscure languages or scripts. Decipherment in cryptography refers to decryption. The term is used sardonically in everyday language to describe attempts to read poor handwriting. In genetics, decipherment is the successful attempt to understand DNA, which is viewed metaphorically as a text containing word-like units (Snustad, D. Peter, et al. (2016). Principles of Genetics. Wiley, p. 302). Throughout science the term decipherment is synonymous with the understanding of biological and chemical phenomena.

Ancient languages
In a few cases, a multilingual artifact has been necessary to facilitate decipherment, the Rosetta Stone being the classic example. Statistical techniques provide another pathway to decipherment, as does the analysis of modern languages derived from ancient languages in which undeciphered texts are written. Archaeological and historical information is helpful in verifying h ...


Historical Document
Historical documents are original documents that contain important historical information about a person, place, or event and can thus serve as primary sources, an important ingredient of historical methodology. Significant historical documents can be deeds, laws, accounts of battles (often given by the victors or persons sharing their viewpoint), or the exploits of the powerful. Though these documents are of historical interest, they do not detail the daily lives of ordinary people, or the way society functioned. Anthropologists, historians and archeologists generally are more interested in documents that describe the day-to-day lives of ordinary people, indicating what they ate, their interaction with other members of their households and social groups, and their states of mind. It is this information that allows them to try to understand and describe the way society was functioning at any particular time in history. Greek ostraka provide good examples of historical docum ...


Philology
Philology is the study of language in oral and written historical sources; it is the intersection of textual criticism, literary criticism, history, and linguistics (with especially strong ties to etymology). Philology is also defined as the study of literary texts as well as oral and written records, the establishment of their authenticity and their original form, and the determination of their meaning. A person who pursues this kind of study is known as a philologist. In older usage, especially British, philology is more general, covering comparative and historical linguistics. Classical philology studies classical languages. Classical philology principally originated from the Library of Pergamum and the Library of Alexandria around the fourth century BC, continued by Greeks and Romans throughout the Roman/Byzantine Empire. It was eventually resumed by European scholars of the Renaissance, where it was s ...


Parallel Corpora
A parallel text is a text placed alongside its translation or translations. Parallel text alignment is the identification of the corresponding sentences in both halves of the parallel text. The Loeb Classical Library and the Clay Sanskrit Library are two examples of dual-language series of texts. Reference Bibles may contain the original languages and a translation, or several translations by themselves, for ease of comparison and study; Origen's Hexapla (Greek for "sixfold") placed six versions of the Old Testament side by side. A famous example is the Rosetta Stone, whose discovery made it possible to begin deciphering the Ancient Egyptian language. Large collections of parallel texts are called parallel corpora (see text corpus). Alignments of parallel corpora at sentence level are a prerequisite for many areas of linguistic research. During translation, sentences can be split, merged, deleted, inserted or reordered by the translator. This makes alignment a non-trivial task. P ...
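To make the alignment problem concrete, here is a minimal Python sketch of a length-based sentence aligner. It is a simplified heuristic written for this illustration, not a published algorithm: it assumes that corresponding passages have similar character lengths, allows 1-1, 2-1, 1-2 alignments and skipped sentences, and does not handle reordering. The function name `align` and the penalty values are assumptions.

```python
def align(src, tgt, skip_penalty=10, merge_penalty=3):
    """Align two lists of sentences; return (source_group, target_group) pairs.

    Cost of a pairing = difference in total character length + a penalty
    for anything other than a plain one-to-one match (illustrative values).
    """
    # (source sentences consumed, target sentences consumed, penalty)
    beads = [
        (1, 1, 0),              # one-to-one
        (1, 0, skip_penalty),   # source sentence deleted in translation
        (0, 1, skip_penalty),   # target sentence inserted by translator
        (2, 1, merge_penalty),  # two source sentences merged
        (1, 2, merge_penalty),  # one source sentence split
    ]
    n, m = len(src), len(tgt)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0

    # Dynamic programming over prefixes of both texts.
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for di, dj, penalty in beads:
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                s_len = sum(len(s) for s in src[i:ni])
                t_len = sum(len(t) for t in tgt[j:nj])
                c = cost[i][j] + abs(s_len - t_len) + penalty
                if c < cost[ni][nj]:
                    cost[ni][nj] = c
                    back[ni][nj] = (i, j)

    # Walk the back-pointers from the end to recover the alignment.
    pairs = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        pairs.append((src[pi:i], tgt[pj:j]))
        i, j = pi, pj
    return pairs[::-1]
```

Calling `align(source_sentences, target_sentences)` returns groups of sentences rather than single pairs, which is how merged or split sentences are represented. Practical aligners typically add lexical cues on top of length, since length alone can mislead on short sentences.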


Machine Translation
Machine translation, sometimes referred to by the abbreviation MT (not to be confused with computer-aided translation, machine-aided human translation or interactive translation), is a sub-field of computational linguistics that investigates the use of software to translate text or speech from one language to another. On a basic level, MT performs mechanical substitution of words in one language for words in another, but that alone rarely produces a good translation because recognition of whole phrases and their closest counterparts in the target language is needed. Not all words in one language have equivalent words in another language, and many words have more than one meaning. Solving this problem with corpus statistical and neural techniques is a rapidly growing field that is leading to better translations, handling differences in linguistic typology, translation of idioms, and the isolation of anomalies. Current machine translation software often allows for customizat ...
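The limitation of word-for-word substitution can be seen in a small Python sketch. The English-to-Spanish word list, the single-entry phrase table and the function name `translate` are all assumptions invented for this example; they are a toy and do not reflect how any particular MT system is built.

```python
# Hypothetical toy lexicon: one target word per source word.
WORDS = {"he": "él", "kicked": "pateó", "the": "el", "bucket": "cubo"}

# Hypothetical phrase table: an idiom mapped to its closest counterpart.
PHRASES = {("kicked", "the", "bucket"): "estiró la pata"}


def translate(tokens, phrases=None):
    """Translate a token list, preferring the longest known phrase.

    With no phrase table this degrades to plain word substitution.
    Unknown words are copied through unchanged.
    """
    phrases = phrases or {}
    longest = max((len(p) for p in phrases), default=1)
    out, i = [], 0
    while i < len(tokens):
        # Try the longest candidate phrase first, down to a single word.
        for n in range(min(longest, len(tokens) - i), 0, -1):
            chunk = tuple(tokens[i:i + n])
            if n > 1 and chunk in phrases:
                out.append(phrases[chunk])
                break
            if n == 1:
                out.append(WORDS.get(tokens[i], tokens[i]))
        i += n  # advance past whatever was just translated
    return " ".join(out)


if __name__ == "__main__":
    sentence = ["he", "kicked", "the", "bucket"]
    print(translate(sentence))           # literal, word by word
    print(translate(sentence, PHRASES))  # idiom handled as a unit
```

The first call renders the idiom literally, while the second treats the whole phrase as one unit, which is the phrase-level recognition the passage above says mechanical substitution lacks.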




Foreign Language Writing Aid
A foreign language writing aid is a computer program or any other instrument that assists a non-native language user (also referred to as a foreign language learner) in writing competently in their target language. Assistive operations can be classified into two categories: on-the-fly prompts and post-writing checks. Assisted aspects of writing include lexical, syntactic (syntactic and semantic roles of a word's frame), lexical semantic (context/collocation-influenced word choice and user-intention-driven synonym choice) and idiomatic expression transfer. Different types of foreign language writing aids include automated proofreading applications, text corpora, dictionaries, translation aids and orthography aids.

Background
The four major components in the acquisition of a language are listening, speaking, reading and writing (Gregersen, T.S. (2003). To Err Is Human: A Reminder to Teachers of Language-Anxious Students. Foreign Language Annals, 36(1): 25-32). While most ...


Language Teaching
Language education, the process and practice of teaching a second or foreign language, is primarily a branch of applied linguistics, but can be an interdisciplinary field. There are four main learning categories for language education: communicative competencies, proficiencies, cross-cultural experiences, and multiple literacies.

Need
Increasing globalization has created a great need for people in the workforce who can communicate in multiple languages. Common languages are used in areas such as trade, tourism, diplomacy, technology, media, translation, interpretation and science. Many countries such as Korea (Kim Yeong-seo, 2009), Japan (Kubota, 1998) and China (Kirkpatrick & Zhichang, 2002) frame education policies to teach at least one foreign language at the primary and secondary school levels. However, some countries such as India, Singapore, Malaysia, Pakistan, and the Philippines use a second official language in their governments. According to GAO (2010), China ...