Xaira
Xaira is an XML Aware Indexing and Retrieval Architecture developed at Oxford University. It was funded by the Mellon Foundation between 2005 and 2006. It is based on SARA, an SGML-aware text-searching system originally developed for searching the British National Corpus. Xaira has been redeveloped as a generic system for constructing query systems for any kind of XML data, in particular for use wit ...
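To make the idea of an XML-aware index concrete, here is a minimal Python sketch. It is not Xaira's actual design or API: the element names `s` and `w`, the sample text, and the function `build_index` are all assumptions made for illustration. It builds a simple inverted index from word forms to their sentence number and position, which is one common way a query system over XML text could locate hits.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample document: sentences <s> containing word tokens <w>.
SAMPLE = """<text>
  <s n="1"><w>the</w><w>cat</w><w>sat</w></s>
  <s n="2"><w>the</w><w>dog</w><w>sat</w></s>
</text>"""

def build_index(xml_string):
    """Map each lower-cased word form to a list of (sentence id, position)."""
    index = defaultdict(list)
    root = ET.fromstring(xml_string)
    for sentence in root.iter("s"):
        for position, word in enumerate(sentence.iter("w")):
            index[word.text.lower()].append((sentence.get("n"), position))
    return index

index = build_index(SAMPLE)
hits = index.get("sat", [])  # every location of the form "sat"
```

Because the index keeps the sentence identifier taken from the markup, a query can report where a word occurs in terms of the document's own structure rather than raw character offsets.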


Lou Burnard
Lou Burnard (born 1946 in Birmingham, England) is an internationally recognised expert in digital humanities, particularly in the area of text encoding and digital libraries. He was assistant director of Oxford University Computing Services (OUCS) from 2001 to September 2010, when he officially retired from OUCS. Prior to that, he was manager of the Humanities Computing Unit at OUCS for five years. He has worked in ICT support for research in the humanities since the 1990s. He was one of the founding editors of the Text Encoding Initiative (TEI) and continues to play an active part in its maintenance and development, as a consultant to the TEI Technical Council and as an elected TEI board member. He has played a key role in the establishment of many other key activities and initiatives in this area, such as the UK Arts and Humanities Data Service and the British National Corpus, and has published and lectured widely. Since 2008 he has also worked as a Member of the ...


British National Corpus
The British National Corpus (BNC) is a 100-million-word text corpus of samples of written and spoken English from a wide range of sources. The corpus covers British English of the late 20th century from a wide variety of genres, with the intention that it be a representative sample of spoken and written British English of that time. It is used in corpus linguistics for the analysis of corpora.

History

The project to create the BNC involved the collaboration of three publishers (Oxford University Press as the lead collaborator, Longman, and W. & R. Chambers), two universities (the University of Oxford and Lancaster University), and the British Library. The creation of the BNC started in 1991 under the management of the BNC consortium, and the project was finished by 1994. No new samples have been added since 1994, but the BNC underwent slight revisions before the release of the second edition, BNC World (2001), and the third edition, BNC XML Edition (2007).


Andrew W
Andrew is the English form of a given name common in many countries. In the 1990s, it was among the top ten most popular names given to boys in English-speaking countries. "Andrew" is frequently shortened to "Andy" or "Drew". The word is derived from the Greek Ἀνδρέας, ''Andreas'', itself related to the Ancient Greek ἀνήρ/ἀνδρός, ''aner/andros'', "man" (as opposed to "woman"), thus meaning "manly" and, as a consequence, "brave", "strong", "courageous", and "warrior". In the King James Bible, the Greek "Ἀνδρέας" is translated as Andrew.

Popularity

Australia

In 2000, the name Andrew was the second most popular name in Australia. In 1999, it was the 19th most common name, while in 1940, it was the 31st most common name. Andrew was the most popular name given to boys in the Northern Territory from 2003 to 2015 and continuing. In Victoria, Andrew was the most popular name for a boy in the 1970s.

Canada

Andrew was the 20th most popular name chosen for mal ...


Standard Generalized Markup Language
The Standard Generalized Markup Language (SGML; ISO 8879:1986) is a standard for defining generalized markup languages for documents. ISO 8879 Annex A.1 states that generalized markup is "based on two postulates":

* Declarative: Markup should describe a document's structure and other attributes rather than specify the processing that needs to be performed, because it is less likely to conflict with future developments.
* Rigorous: In order to allow markup to take advantage of the techniques available for processing, markup should rigorously define objects like programs and databases.

DocBook SGML and LinuxDoc are examples which used SGML tools.

Standard versions

SGML is an ISO standard: "ISO 8879:1986 Information processing – Text and office systems – Standard Generalized Markup Language (SGML)", of which there are three versions:

* Original ''SGML'', which was accepted in October 1986, followed by a minor Technical Corrigendum.
* ''SGML (ENR)'', in 1996, resul ...


Text Encoding Initiative
The Text Encoding Initiative (TEI) is a text-centric community of practice in the academic field of digital humanities, operating continuously since the 1980s. The community currently runs a mailing list, meetings and a conference series, and maintains the TEI technical standard, a journal, a wiki, a GitHub repository and a toolchain.

TEI guidelines

The ''TEI Guidelines'' collectively define a type of XML format, and are the defining output of the community of practice. The format differs from other well-known open formats for text (such as HTML and OpenDocument) in that it is primarily semantic rather than presentational; the semantics and interpretation of every tag and attribute are specified. There are some 500 different textual components and concepts; each is grounded in one or more academic disciplines and examples are given.

Technical details

The standard is split into two parts, a discursive textual description with extended examples and discussion a ...


XML-RPC
XML-RPC is a remote procedure call (RPC) protocol which uses XML to encode its calls and HTTP as a transport mechanism (Simon St. Laurent, Joe Johnston, Edd Dumbill, ''Programming Web Services with XML-RPC'', first edition, O'Reilly, June 2001).

History

The XML-RPC protocol was created in 1998 by Dave Winer of UserLand Software and Microsoft, with Microsoft seeing the protocol as an essential part of scaling up its efforts in business-to-business e-commerce. As new functionality was introduced, the standard evolved into what is now SOAP. UserLand supported XML-RPC from version 5.1 of its Frontier web content management system, released in June 1998. XML-RPC's idea of a human-readable-and-writable, script-parsable standard for HTTP-based requests and responses has also been implemented in competing specifications such as Allaire's Web Distributed Data Exchange (WDDX) and webMethods' Web Interface Definition Language (WIDL). Prior art wrapping COM, CORBA, and Java RMI objects i ...
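As a small illustration of how XML-RPC encodes a call as XML, the following Python sketch uses the standard library's `xmlrpc.client` module to serialise a request and decode it again, without touching the network. The method name `corpus.count` and its argument are hypothetical and chosen only for this example; in real use the XML payload would be sent as the body of an HTTP POST.

```python
import xmlrpc.client

# Encode a call to a hypothetical remote method "corpus.count"
# with a single string argument. The result is an XML string.
request_xml = xmlrpc.client.dumps(("walk",), methodname="corpus.count")

# Decode the payload back into Python values, as a server would
# after receiving the HTTP request body.
params, method = xmlrpc.client.loads(request_xml)

print(request_xml)
```

Printing `request_xml` shows a `methodCall` element containing the method name and the parameters, which is the human-readable, script-parsable form the protocol is known for.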


SOAP
Soap is a salt of a fatty acid used in a variety of cleansing and lubricating products. In a domestic setting, soaps are surfactants usually used for washing, bathing, and other types of housekeeping. In industrial settings, soaps are used as thickeners, components of some lubricants, and precursors to catalysts. When used for cleaning, soap solubilizes particles and grime, which can then be separated from the article being cleaned. In hand washing, as a surfactant, when lathered with a little water, soap kills microorganisms by disorganizing their membrane lipid bilayer and denaturing their proteins. It also emulsifies oils, enabling them to be carried away by running water. Soap is created by mixing fats and oils with a base. A similar process is used for making detergent which is also created by combining chemical compounds in a mixer. Humans have used soap for millennia. Evidence exists for the production of soap-like materials in ancient Babylon around 2800 ...




Corpus Linguistics
Corpus linguistics is the study of a language as that language is expressed in its text corpus (plural ''corpora''), its body of "real world" text. Corpus linguistics proposes that a reliable analysis of a language is more feasible with corpora collected in the field, the natural context ("realia") of that language, with minimal experimental interference. The text-corpus method uses the body of texts written in any natural language to derive the set of abstract rules which govern that language. Those results can be used to explore the relationships between that subject language and other languages which have undergone a similar analysis. The first such corpora were manually derived from source texts, but now that work is automated. Corpora have been used not only for linguistics research but also to compile dictionaries (starting with ''The American Heritage Dictionary of the English Language'' in 1969) and grammar guides, such as ''A Compreh ...


Lemmatisation
Lemmatisation (or lemmatization) in linguistics is the process of grouping together the inflected forms of a word so they can be analysed as a single item, identified by the word's lemma, or dictionary form. In computational linguistics, lemmatisation is the algorithmic process of determining the lemma of a word based on its intended meaning. Unlike stemming, lemmatisation depends on correctly identifying the intended part of speech and meaning of a word in a sentence, as well as within the larger context surrounding that sentence, such as neighboring sentences or even an entire document. As a result, developing efficient lemmatisation algorithms is an open area of research.

Description

In many languages, words appear in several ''inflected'' forms. For example, in English, the verb 'to walk' may appear as 'walk', 'walked', 'walks' or 'walking'. The base form, 'walk', that one might look up in a dictionary, is called the ''lemma'' for the word. The association of the base form ...
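The dependence on part of speech can be shown with a toy Python sketch. This is a hand-written lookup table, not a real lemmatiser: the table `LEMMAS`, the tag names `VERB` and `NOUN`, and the function `lemmatise` are assumptions made for illustration, and a practical system would need a full lexicon plus a way to work out the part of speech from context.

```python
# Toy lexicon keyed on (word form, part of speech).
LEMMAS = {
    ("walk", "VERB"): "walk",
    ("walks", "VERB"): "walk",
    ("walked", "VERB"): "walk",
    ("walking", "VERB"): "walk",
    ("walks", "NOUN"): "walk",
    ("meeting", "NOUN"): "meeting",  # "a long meeting"
    ("meeting", "VERB"): "meet",     # "we are meeting today"
}

def lemmatise(form, pos):
    """Return the lemma for a word form, given its part of speech.

    Unknown (form, pos) pairs fall back to the lower-cased form itself.
    """
    key = (form.lower(), pos)
    return LEMMAS.get(key, form.lower())

for form, pos in [("walked", "VERB"), ("meeting", "NOUN"), ("meeting", "VERB")]:
    print(form, pos, "->", lemmatise(form, pos))
```

The same surface form 'meeting' maps to two different lemmas depending on its tag, which is the behaviour that distinguishes lemmatisation from simple suffix stripping.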


XML Software
Extensible Markup Language (XML) is a markup language and file format for storing, transmitting, and reconstructing arbitrary data. It defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. The World Wide Web Consortium's XML 1.0 Specification of 1998 and several other related specifications (all of them free open standards) define XML. The design goals of XML emphasize simplicity, generality, and usability across the Internet. It is a textual data format with strong support via Unicode for different human languages. Although the design of XML focuses on documents, the language is widely used for the representation of arbitrary data structures such as those used in web services. Several schema systems exist to aid in the definition of XML-based languages, while programmers have developed many application programming interfaces (APIs) to ...
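A short Python sketch can show what "both human-readable and machine-readable" means in practice. It uses the standard library's `xml.etree.ElementTree`, one of many available XML APIs; the sample document, its element names and its attributes are invented for this example.

```python
import xml.etree.ElementTree as ET

# A small, hypothetical XML document that a person can read directly.
# It includes a non-ASCII character to show Unicode text content.
DOC = (
    '<text lang="en">'
    '<s n="1"><w pos="VERB">walked</w><w pos="NOUN">café</w></s>'
    '</text>'
)

# Parse the string into an element tree that a program can navigate.
root = ET.fromstring(DOC)

# Collect each word together with its part-of-speech attribute.
words = [(w.text, w.get("pos")) for w in root.iter("w")]

print(root.get("lang"), words)
```

The same markup that a reader can scan by eye is turned into a tree of elements, attributes and text that code can query.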