List Of Information Retrieval Libraries

	List Of Information Retrieval Libraries This is a list of free information retrieval libraries, which are libraries used in software development for performing it retrieval functions. It is not a complete list of such libraries, but is instead a list of free information retrieval libraries with articles on Wikipedia. It does not include commercial software libraries. Libraries for searching and indexing Lemur Lucene Solr Elasticsearch Manatee Sphinx * Terrier Search Engine Xapian Xapian is a free and open-source probabilistic information retrieval library, released under the GNU General Public License (GPL). It is a full-text search engine library for programmers. It is written in C++, with bindings to allow use from P ... {{DEFAULTSORT:Information retrieval libraries Lists of software ... [...More Info...] [...Related Items...] OR:* [Wikipedia] [Google] [Baidu]
picture info	Library (computing) In computer science, a library is a collection of non-volatile resources used by computer programs, often for software development. These may include configuration data, documentation, help data, message templates, pre-written code and subroutines, classes, values or type specifications. In IBM's OS/360 and its successors they are referred to as partitioned data sets. A library is also a collection of implementations of behavior, written in terms of a language, that has a well-defined interface by which the behavior is invoked. For instance, people who want to write a higher-level program can use a library to make system calls instead of implementing those system calls over and over again. In addition, the behavior is provided for reuse by multiple independent programs. A program invokes the library-provided behavior via a mechanism of the language. For example, in a simple imperative language such as C, the behavior in a library is invoked by using C's normal func ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Software Development Software development is the process of conceiving, specifying, designing, programming, documenting, testing, and bug fixing involved in creating and maintaining applications, frameworks, or other software components. Software development involves writing and maintaining the source code, but in a broader sense, it includes all processes from the conception of the desired software through to the final manifestation of the software, typically in a planned and structured process. Software development also includes research, new development, prototyping, modification, reuse, re-engineering, maintenance, or any other activities that result in software products. Methodologies One system development methodology is not necessarily suitable for use by all projects. Each of the available methodologies are best suited to specific kinds of projects, based on various technical, organizational, project, and team considerations. Software development activities Identification of need The ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Information Retrieval Information retrieval (IR) in computing and information science is the process of obtaining information system resources that are relevant to an information need from a collection of those resources. Searches can be based on full-text or other content-based indexing. Information retrieval is the science of searching for information in a document, searching for documents themselves, and also searching for the metadata that describes data, and for databases of texts, images or sounds. Automated information retrieval systems are used to reduce what has been called information overload. An IR system is a software system that provides access to books, journals and other documents; stores and manages those documents. Web search engines are the most visible IR applications. Overview An information retrieval process begins when a user or searcher enters a query into the system. Queries are formal statements of information needs, for example search strings in web search engines. In ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	Lemur Project The Lemur Project is a collaboration between the Center for Intelligent Information Retrieval at the University of Massachusetts Amherst and the Language Technologies Institute at Carnegie Mellon University. The Lemur Project develops search engines, browser toolbars, text analysis tools, and data resources that support research and development of information retrieval and text mining software. The project is best known for its Indri and Galago search engines, the ClueWeb09 and ClueWeb12 datasets, and the RankLib learning-to-rank library. The software and datasets are used widely in scientific and research applications, as well as in some commercial applications. The Lemur Project's software development philosophy emphasizes state-of-the-art accuracy, flexibility, and efficiency. For example, the Indri search engine provides accurate search for large text collections 'out of the box', and data is stored in an accessible manner to support development of new retrieval strategies. S ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	Apache Lucene Apache Lucene is a free and open-source software, free and open-source Search engine (computing), search engine Library (computing), software library, originally written in Java (programming language), Java by Doug Cutting. It is supported by the Apache Software Foundation and is released under the Apache Software License. Lucene is widely used as a standard foundation for non-research search applications. Lucene has been ported to other programming languages including Object Pascal, Perl, C Sharp (programming language), C#, C++, Python (programming language), Python, Ruby (programming language), Ruby and PHP. History Doug Cutting originally wrote Lucene in 1999. Lucene was his fifth search engine, having previously written two while at Xerox PARC, one at Apple, and a fourth at Excite. It was initially available for download from its home at the SourceForge web site. It joined the Apache Software Foundation's Jakarta Project, Jakarta family of open-source Java products in Septem ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Apache Solr Solr (pronounced "solar") is an open-source enterprise-search platform, written in Java. Its major features include full-text search, hit highlighting, faceted search, real-time indexing, dynamic clustering, database integration, NoSQL features and rich document (e.g., Word, PDF) handling. Providing distributed search and index replication, Solr is designed for scalability and fault tolerance. Solr is widely used for enterprise search and analytics use cases and has an active development community and regular releases. Solr runs as a standalone full-text search server. It uses the Lucene Java search library at its core for full-text indexing and search, and has REST-like HTTP/XML and JSON APIs that make it usable from most popular programming languages. Solr's external configuration allows it to be tailored to many types of applications without Java coding, and it has a plugin architecture to support more advanced customization. Apache Solr is developed in an open, collaborat ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Elasticsearch Elasticsearch is a search engine based on the Lucene library. It provides a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents. Elasticsearch is developed in Java and is dual-licensed under the source-available Server Side Public License and the Elastic license, while other parts fall under the proprietary ( ''source-available'') ''Elastic License''. Official clients are available in Java, .NET ( C#), PHP, Python, Ruby and many other languages. According to the DB-Engines ranking, Elasticsearch is the most popular enterprise search engine. History Shay Banon created the precursor to Elasticsearch, called Compass, in 2004. While thinking about the third version of Compass he realized that it would be necessary to rewrite big parts of Compass to "create a scalable search solution". So he created "a solution built from the ground up to be distributed" and used a common interface, JSON over HTTP, suitable fo ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	Sketch Engine Sketch Engine is a corpus manager and text analysis software developed by Lexical Computing CZ s.r.o. since 2003. Its purpose is to enable people studying language behaviour ( lexicographers, researchers in corpus linguistics, translators or language learners) to search large text collections according to complex and linguistically motivated queries. Sketch Engine gained its name after one of the key features, word sketches: one-page, automatic, corpus-derived summaries of a word's grammatical and collocational behaviour. Currently, it supports and provides corpora in 90+ languages. History of development Sketch Engine is a product of Lexical Computing Limited, a company founded in 2003 by the lexicographer and research scientist Adam Kilgarriff. He started a collaboration with Pavel Rychlý, a computer scientist working at the Natural Language Processing Centre, Masaryk University, and the developer of Manatee and Bonito (two major parts of the software suite), and introduce ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	Sphinx (search Engine) Sphinx is a fulltext Search engine (computing), search engine that provides text search functionality to client applications. Overview Sphinx can be used either as a stand-alone server or as a storage engine ("SphinxSE") for the MySQL family of databases. When run as a standalone server Sphinx operates similar to a DBMS and can communicate with MySQL, MariaDB and PostgreSQL through their native protocols or with any ODBC-compliant DBMS via ODBC. MariaDB, a fork of MySQL, is distributed with SphinxSE. SphinxAPI If Sphinx is run as a stand-alone server, it is possible to use SphinxAPI to connect an application to it. Official implementations of the API are available for PHP, Java (programming language), Java, Perl, Ruby (programming language), Ruby and Python (programming language), Python languages. Unofficial implementations for other languages, as well as various third party plugins and modules are also available. Other data sources can be indexed via pipe in a custom XML for ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	Terrier Search Engine The Terrier IR Platform is a modular open source software for the rapid development of large-scale information retrieval applications. Terrier was developed by members of thInformation Retrieval Research Group Department of Computing Science, at the University of Glasgow. A core version of Terrier is available as open source software under the Mozilla Public License (MPL), with the aim to facilitate experimentation and research in the wider information retrieval community. Terrier is written in Java Java (; id, Jawa, ; jv, ꦗꦮ; su, ) is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea to the north. With a population of 151.6 million people, Java is the world's mo .... References BibliographyTerrier: A High Performance and Scalable Information Retrieval Platform (pdf)- Iadh Ounis, Gianni Amati, Vassilis Plachouras, Ben He, Craig Macdonald, and Christina Lioma. In Proceedings of ACM SIGIR'06 Worksh ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	Xapian Xapian is a free and open-source probabilistic information retrieval library, released under the GNU General Public License (GPL). It is a full-text search engine library for programmers. It is written in C++, with bindings to allow use from Perl, Python (2 and 3), PHP (5 and 7), Java, Tcl, C#, Ruby, Lua, Erlang, Node.js and R. Xapian is highly portable and runs on Linux, OS X, FreeBSD, NetBSD, OpenBSD, Solaris, HP-UX, AIX, Windows, OS/2 and Hurd, as well as Tru64. Xapian grew out of the Muscat search engine, written by Dr. Martin F. Porter at the University of Cambridge. The first official release of Xapian was version 0.5.0 on September 20, 2002. Xapian allows developers to add advanced indexing and search facilities to their own applications. Organisations and projects using Xapian include the Library of the University of Cologne, Debian, Die Zeit, MoinMoin, and One Laptop per Child. Features * Supports Unicode 9.0 (including codepoints beyond the BMP) and stor ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]