Concordancer





Concordancer
A concordancer is a computer program that automatically constructs a concordance. The output of a concordancer may serve as input to a translation memory system for computer-assisted translation, or as an early step in machine translation. Concordancers are also used in corpus linguistics to retrieve alphabetically or otherwise sorted lists of linguistic data from the corpus in question, which the corpus linguist then analyzes. A number of concordancers have been published, notably the Oxford Concordance Program (OCP), first released in 1981 by Oxford University Computing Services, which is claimed to be used in over 200 organisations worldwide.
S. Hockey and J. Martin, "The Oxford Concordance Program Version 2", Literary and Linguistic Computing, Volume 2, Issue 2, 1 January 1987, pages 125–131, https://doi.org/10.1093/llc/2.2.1 ...


Computer-assisted Translation
Computer-aided translation (CAT), also referred to as computer-assisted translation or computer-aided human translation (CAHT), is the use of software to assist a human translator in the translation process. The translation is created by a human, and certain aspects of the process are facilitated by software; this is in contrast with machine translation (MT), in which the translation is created by a computer, optionally with some human intervention (e.g. pre-editing and post-editing). CAT tools are typically understood to mean programs that specifically facilitate the actual translation process. Most CAT tools have (a) the ability to translate a variety of source file formats in a single editing environment without needing to use the file format's associated software for most or all of the translation process, (b) translation memory, and (c) integration of various utilities or processes that increase productivity and consistency in translation. Range of tools Computer-assis ...


Key Word In Context
Key Word In Context (KWIC) is the most common format for concordance lines. The term KWIC was first coined by Hans Peter Luhn. The system was based on a concept called "keyword in titles", which was first proposed for Manchester libraries in 1864 by Andrea Crestadoro. A KWIC index is formed by sorting and aligning the words within an article title to allow each word (except the stop words) in titles to be searchable alphabetically in the index. It was a useful indexing method for technical manuals before computerized full text search became common. For example, a search query including all of the words in an example definition ("KWIC is an acronym for Key Word In Context, the most common format for concordance lines") and the Wikipedia slogan in English ("the free encyclopedia"), searched against a Wikipedia page, might yield a KWIC index as follows. A KWIC index usually uses a wide layout to allow the display of maximum 'in context' information (not shown in the following example ...
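The sorting-and-aligning step described above can be sketched in a few lines of Python. This is only an illustration, not Luhn's original procedure: the function name `kwic_index`, the small `STOP_WORDS` set, and the 30-character column width are all assumptions made for the example.

```python
# Hypothetical stop-word list; a real index would use a much larger one.
STOP_WORDS = {"a", "an", "and", "for", "in", "is", "of", "the", "to"}


def kwic_index(titles, stop_words=STOP_WORDS, width=30):
    """Return KWIC lines, one per non-stop word, sorted by keyword.

    Each line shows the text before the keyword right-aligned in a
    `width`-character column, then the keyword and the text after it.
    """
    entries = []
    for title in titles:
        words = title.split()
        for i, word in enumerate(words):
            key = word.strip(".,;:()\"'").lower()
            if not key or key in stop_words:
                continue  # stop words are not indexed
            left = " ".join(words[:i])
            right = " ".join(words[i:])
            entries.append((key, left, right))
    # Sort alphabetically by keyword, then by the text that follows it.
    entries.sort(key=lambda e: (e[0], e[2].lower()))
    return [f"{left[-width:]:>{width}}  {right[:width]}"
            for _, left, right in entries]


for line in kwic_index(["KWIC is an acronym for Key Word In Context",
                        "the free encyclopedia"]):
    print(line)
```

Every keyword starts in the same column, so a reader can scan straight down the index alphabetically while still seeing each word's surrounding context.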



Computer Program
A computer program is a sequence or set of instructions in a programming language for a computer to execute. Computer programs are one component of software, which also includes documentation and other intangible components. A computer program in its human-readable form is called source code. Source code needs another computer program to execute because computers can only execute their native machine instructions. Therefore, source code may be translated to machine instructions using the language's compiler. (Assembly language programs are translated using an assembler.) The resulting file is called an executable. Alternatively, source code may execute within the language's interpreter. If the executable is requested for execution, then the operating system loads it into memory and starts a process. The central processing unit will soon switch to this process so it can fetch, decode, and then execute each machine instruction. If the source code is requested for execution, ...



Concordance (publishing)
A concordance is an alphabetical list of the principal words used in a book or body of work, listing every instance of each word with its immediate context. Concordances have been compiled only for works of special importance, such as the Vedas, Bible, Qur'an or the works of Shakespeare, James Joyce or classical Latin and Greek authors, because of the time, difficulty, and expense involved in creating a concordance in the pre-computer era. A concordance is more than an index, with additional material such as commentary, definitions and topical cross-indexing which makes producing one a labor-intensive process even when assisted by computers. In the precomputing era, search technology was unavailable, and a concordance offered readers of long works such as the Bible something comparable to search results for every word that they would have been likely to search for. Today, the ability to combine the result of queries concerning multiple terms (such as searching for words near ...
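As a rough illustration of the core idea (every instance of each word, listed alphabetically with its immediate context), here is a minimal Python sketch. It is an assumption-laden toy: `build_concordance` is a hypothetical name, it indexes every word rather than only the principal ones, and it adds none of the commentary or cross-indexing a published concordance would include.

```python
import re
from collections import defaultdict


def build_concordance(text, context=3):
    """Map each word to its occurrences as (token position, context snippet).

    `context` is the number of neighbouring words kept on either side.
    """
    tokens = re.findall(r"[A-Za-z']+", text)
    concordance = defaultdict(list)
    for i, token in enumerate(tokens):
        start = max(0, i - context)
        snippet = " ".join(tokens[start:i + context + 1])
        concordance[token.lower()].append((i, snippet))
    # Return the headwords in alphabetical order.
    return dict(sorted(concordance.items()))


sample = "To be, or not to be, that is the question"
for word, hits in build_concordance(sample, context=2).items():
    for position, snippet in hits:
        print(f"{word:<10}{position:>3}  {snippet}")
```

Looking up a headword in the returned dictionary plays the role that a word search plays today: it yields every place the word occurs without rereading the text.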


Translation Memory
A translation memory (TM) is a database that stores "segments", which can be sentences, paragraphs or sentence-like units (headings, titles or elements in a list) that have previously been translated, in order to aid human translators. The translation memory stores the source text and its corresponding translation in language pairs called "translation units". Individual words are handled by terminology bases and are not within the domain of TM. Software programs that use translation memories are sometimes known as translation memory managers (TMM) or translation memory systems (TM systems, not to be confused with a translation management system (TMS), which is another type of software focused on managing the process of translation). Translation memories are typically used in conjunction with a dedicated computer-assisted translation (CAT) tool, word processing program, terminology management system, multilingual dictionary, or even raw machine translation output. Research indicat ...
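The data structure described above, source segments paired with their translations, can be sketched as follows. The class name `TranslationMemory`, the similarity threshold, and the use of `difflib.SequenceMatcher` for approximate matching are assumptions for illustration; the text does not say how any particular TM system scores matches.

```python
from difflib import SequenceMatcher


class TranslationMemory:
    """Toy translation memory: stores translation units for one language pair."""

    def __init__(self):
        self.units = {}  # source segment -> target segment

    def add(self, source, target):
        """Store one translation unit."""
        self.units[source] = target

    def lookup(self, segment, threshold=0.75):
        """Return (score, source, target) for the best stored match,
        or None if no stored segment reaches `threshold` similarity."""
        best = None
        for source, target in self.units.items():
            score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
            if score >= threshold and (best is None or score > best[0]):
                best = (score, source, target)
        return best


tm = TranslationMemory()
tm.add("Press the red button.", "Appuyez sur le bouton rouge.")
print(tm.lookup("Press the red button."))    # exact match, score 1.0
print(tm.lookup("Press the green button."))  # close match offered for editing
print(tm.lookup("Hello world"))              # nothing similar enough
```

A close but inexact match is returned together with its score so that a human translator can decide whether to reuse and adapt the stored translation.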



Machine Translation
Machine translation, sometimes referred to by the abbreviation MT (not to be confused with computer-aided translation, machine-aided human translation or interactive translation), is a sub-field of computational linguistics that investigates the use of software to translate text or speech from one language to another. On a basic level, MT performs mechanical substitution of words in one language for words in another, but that alone rarely produces a good translation because recognition of whole phrases and their closest counterparts in the target language is needed. Not all words in one language have equivalent words in another language, and many words have more than one meaning. Solving this problem with corpus statistical and neural techniques is a rapidly growing field that is leading to better translations, handling differences in linguistic typology, translation of idioms, and the isolation of anomalies. Current machine translation software often allows for customizat ...
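The limitation of mechanical word substitution mentioned above can be seen in a toy Python sketch. The two tiny lexicons (`WORDS`, `PHRASES`) and both function names are invented for this example, and the greedy longest-phrase lookup is only one simple way to recognise whole phrases, not a description of how real MT systems work.

```python
# Hypothetical Spanish-to-English lexicons, just large enough for the demo.
WORDS = {"tengo": "I have", "hambre": "hunger"}
PHRASES = {("tengo", "hambre"): "I am hungry"}


def word_for_word(sentence):
    """Substitute each word independently; unknown words pass through."""
    return " ".join(WORDS.get(w, w) for w in sentence.lower().split())


def phrase_aware(sentence, max_len=3):
    """Prefer the longest known phrase at each position, else fall back
    to single-word substitution."""
    tokens = sentence.lower().split()
    out, i = [], 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            chunk = tuple(tokens[i:i + n])
            if chunk in PHRASES:
                out.append(PHRASES[chunk])
                i += n
                break
        else:  # no phrase matched at this position
            out.append(WORDS.get(tokens[i], tokens[i]))
            i += 1
    return " ".join(out)


print(word_for_word("Tengo hambre"))  # I have hunger
print(phrase_aware("Tengo hambre"))   # I am hungry
```

The word-for-word output is understandable but unidiomatic, while the phrase lookup produces the expression an English speaker would actually use.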




Corpus Linguistics
Corpus linguistics is the study of a language as that language is expressed in its text corpus (plural corpora), its body of "real world" text. Corpus linguistics proposes that a reliable analysis of a language is more feasible with corpora collected in the field, the natural context ("realia") of that language, with minimal experimental interference. The text-corpus method uses the body of texts written in any natural language to derive the set of abstract rules which govern that language. Those results can be used to explore the relationships between that subject language and other languages which have undergone a similar analysis. The first such corpora were manually derived from source texts, but now that work is automated. Corpora have been used not only for linguistics research but also to compile dictionaries (starting with The American Heritage Dictionary of the English Language in 1969) and grammar guides, such as A Compreh ...


Oxford Concordance Program
The Oxford Concordance Program (OCP) was first released in 1981 and was the result of a project started in 1978 by Oxford University Computing Services (OUCS) to create a machine-independent text analysis program for producing word lists, indexes and concordances in a variety of languages and alphabets. In the 1980s it was claimed to have been licensed to around 240 institutions in 23 countries.
History
OCP was designed and written in FORTRAN by Susan Hockey and Ian Marriott of Oxford University Computing Services in the period 1979–1980, and its authors acknowledged that it owed much to the earlier COCOA and CLOC (University of Birmingham) concordance systems. During 1985–86 OCP was completely rewritten as version 2 to increase the efficiency of the program; a version called Micro-OCP was also produced for the IBM PC.
The ...


Oxford University Computing Services
Until 2012, Oxford University Computing Services (OUCS) provided the central information technology services for the University of Oxford. The service was based at 7-19 Banbury Road in central north Oxford, England, near the junction with Keble Road. OUCS became part of IT Services when the new department was created at the University of Oxford on 1 August 2012 through a merger of the three previous central IT departments: Oxford University Computing Services (OUCS), Business Services and Projects (BSP) and ICT Support Team (ICTST). At the time when Oxford University Computing Services ceased to operate as an independent department, it offered facilities, training and advice to members of the university in all aspects of academic computing. OUCS was responsible for the core networks reaching all departments and colleges of Oxford University. OUCS was made up of five technical groups and one administration group. Each group had responsibility for different aspects of OUCS services supplied t ...


COCOA (digital Humanities)
COCOA (an acronym derived from COunt and COncordance Generation on Atlas) was an early text file utility and associated file format for digital humanities, then known as humanities computing. It consisted of approximately 4000 punched cards of FORTRAN and was created in the late 1960s and early 1970s at University College London and the Atlas Computer Laboratory in Harwell, Oxfordshire. Functionality included word-counting and concordance building.
Oxford Concordance Program
The Oxford Concordance Program (OCP) format was a direct descendant of COCOA developed at Oxford University Computing Services. The Oxford Text Archive holds items in this format.
Later developments
The COCOA file format bears at least a passing similarity to later markup languages such as SGML and XML. A noticeable difference from its successors is that COCOA tags are flat and not tree structured. In that format, every information type and value encoded by a tag should be considered true until the same tag chan ...




Cross-reference
The term cross-reference (abbreviation: xref) can refer to either:
* An instance within a document which refers to related information elsewhere in the same document. In both printed and online dictionaries cross-references are important because they form a network structure of relations existing between different parts of data, dictionary-internal as well as dictionary external.
* In an index, a cross-reference is often denoted by "See also". For example, under the term "Albert Einstein" in the index of a book about Nobel Laureates, there may be the cross-reference "See also: Einstein, Albert".
* In hypertext, cross-referencing is maintained to a document with either in-context (XRIC) or out-of-context (XROC) cross-referencing. These are similar to KWIC and KWOC.
* In programming, "cross-referencing" means the listing of every file name and line number where a given named identifier occurs within the program's source tree.
* In a relational database management sy ...
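The programming sense of cross-referencing described above can be sketched in Python. This is a deliberately naive illustration: `cross_reference` is a hypothetical helper, it takes an in-memory mapping of file names to source text instead of walking a real source tree, and its regular expression treats language keywords and words inside strings or comments as identifiers too.

```python
import re
from collections import defaultdict

IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def cross_reference(sources):
    """Map each identifier to the (file name, line number) pairs where it occurs.

    `sources` maps a file name to that file's text.
    """
    xref = defaultdict(list)
    for filename in sorted(sources):
        for lineno, line in enumerate(sources[filename].splitlines(), start=1):
            # Use a set so a name is listed once per line, however often it appears.
            for name in set(IDENTIFIER.findall(line)):
                xref[name].append((filename, lineno))
    return dict(xref)


sources = {
    "a.py": "total = 0\ntotal += step\n",
    "b.py": "print(total)\n",
}
for name, places in sorted(cross_reference(sources).items()):
    print(name, places)
```

A production tool would use a real parser for the language in question so that only genuine identifiers are listed.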


Ctags
Ctags is a programming tool that generates an index (or tag) file of names found in source and header files of various programming languages to aid code comprehension. Depending on the language, functions, variables, class members, macros and so on may be indexed. These tags allow definitions to be quickly and easily located by a text editor, a code search engine, or other utility. Alternatively, there is also an output mode that generates a cross reference file, listing information about various names found in a set of language files in human-readable form. The original Ctags was introduced in BSD Unix 3.0 and was written by Ken Arnold, with Fortran support by Jim Kleckner and Pascal support by Bill Joy. It is part of the initial release of the Single Unix Specification and XPG4 of 1992.
Editors that support ctags
Tag index files are supported by many source code editors, including:
* Atom
* BBEdit 8+
* CodeLite (via built-in ctagsd language server)
* Cloud9 IDE (uses i ...
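The idea of indexing where names are defined, as described at the start of this entry, can be illustrated with a short Python sketch. It is not Ctags and does not produce the Ctags tag file format: `index_definitions` is a hypothetical function that uses Python's standard `ast` module to list function and class definitions in Python source only.

```python
import ast


def index_definitions(sources):
    """Return sorted (name, file name, line number) tuples for every
    function and class definition found in `sources`.

    `sources` maps a file name to that file's Python source text.
    """
    tags = []
    for filename, text in sources.items():
        tree = ast.parse(text, filename=filename)
        for node in ast.walk(tree):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                tags.append((node.name, filename, node.lineno))
    return sorted(tags)


sources = {
    "shapes.py": "class Circle:\n    def area(self):\n        return 0\n\ndef main():\n    pass\n",
}
for name, filename, lineno in index_definitions(sources):
    print(f"{name}\t{filename}\t{lineno}")
```

An editor could use such an index to jump from a name to the file and line where it is defined, which is the use the entry describes for tags.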