Recoll

Recoll is a desktop search tool that provides full-text search (from single-word to arbitrarily complex boolean searches) in a GUI with few mandatory external dependencies. It runs under many Unix-like operating systems and is mostly independent of the desktop environment. It has been ported to OS/2 and is planned for integration into the OS/2-based ArcaOS. Recoll was designed not to require a permanent daemon, but on Linux systems it can make use of inotify. Recoll updates its index at scheduled intervals (for example through cron jobs), but if desired, the indexing task can run as a file-system monitoring daemon for real-time index updates.
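The batch mode described above amounts to a periodic pass that picks up files changed since the previous run. The following is a minimal sketch of that general idea in Python, not Recoll's actual implementation; the function name `changed_since` and the in-memory `last_seen` dictionary are assumptions made for illustration.

```python
import os


def changed_since(root, last_seen):
    """Return the files under root that are new or modified since the last pass.

    last_seen maps path -> modification time (ns) recorded by the previous
    pass; it is updated in place so the next pass only reports new changes.
    This is an illustrative sketch, not how Recoll tracks file state.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            mtime = os.stat(path).st_mtime_ns
            if last_seen.get(path) != mtime:
                last_seen[path] = mtime
                changed.append(path)
    return sorted(changed)
```

A scheduled job would call such a function once per run and index only the returned paths, whereas a monitoring daemon reacts to change notifications as they arrive.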


Features

* Qt GUI.
* Xapian backend.
* Indexes the contents of many document types: text, HTML, email stores of all kinds, OpenDocument, Microsoft Office and Office Open XML, AbiWord, KWord, Gaim, LyX, Scribus, PDF, WordPerfect, PostScript, RTF, TeX, DVI, DjVu, MP3 and other audio file formats, JPEG and other image file formats.
* Recursively processes embedded documents (email attachments, Zip archives) to arbitrary depths.
* Query facilities, with boolean searches, wildcards, phrases, proximity, and filters on file types and directory trees. GUI boolean search building tool.
* Xesam query language support.
* Word stemming is performed at query time (the stemming language can be switched after indexing).
* Multiple indexes selectable at query time (e.g. personal + system indexes).
* Natively based on Unicode. Supports many languages and character sets, including good support for East Asian texts (CJK).
* MD5 document hashes for the elimination of duplicates in results.
* Batch and real-time indexing modes.
* Python API.
* GNOME Shell search provider, web interface, and Firefox history extensions.
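The recursive processing of embedded documents listed above can be illustrated with the Python standard library. This is a sketch of the general technique for Zip archives only, not Recoll's own code; `walk_embedded` and `make_zip` are hypothetical names introduced here. Note that any Zip-based container (including some office formats) would be treated as an archive by this simple check.

```python
import io
import zipfile


def walk_embedded(data, name="<top>"):
    """Yield (path, bytes) for each leaf document, descending into nested Zips."""
    buf = io.BytesIO(data)
    if zipfile.is_zipfile(buf):
        with zipfile.ZipFile(buf) as zf:
            for info in zf.infolist():
                if info.is_dir():
                    continue  # directory entries carry no document data
                # Recurse: a member may itself be an archive.
                yield from walk_embedded(zf.read(info), f"{name}/{info.filename}")
    else:
        yield name, data


def make_zip(members):
    """Build an in-memory Zip archive from a {member name: bytes} mapping."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for member_name, payload in members.items():
            zf.writestr(member_name, payload)
    return buf.getvalue()
```

For example, an archive built with `make_zip({"inner.zip": make_zip({"note.txt": b"hello"})})` yields a single leaf whose path ends in `inner.zip/note.txt`, however deep the nesting goes.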


Supported file types


File types indexed natively

* Text.
* HTML.
* Maildir, MH, and mailbox (Mozilla, Thunderbird, and Evolution mail are supported). Evolution note: be sure to remove .cache from the skippedNames list in the GUI Indexing preferences/Local Parameters/ pane if you want to index local copies of IMAP mail.
* Gaim and purple log files.
* Scribus files.
* Man pages (needs Groff).
* Mimehtml web archive format (support is based on the mail filter, which introduces some mild weirdness, but is still usable).
* All the following need Python 3:
  * Dia diagrams.
  * Excel and PowerPoint (pre-Open XML).
  * Tar archives. Tar file indexing is disabled by default (because tar archives don't typically contain the kind of documents that people search for); you will need to enable it explicitly, for example with the following in your $HOME/.recoll/mimeconf file:
      [index]
      application/x-tar = execm rcltar
  * Zip archives.
  * Konqueror web archive format (uses the tarfile Python standard library module).
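Since the list above mentions the tarfile standard library module, here is a minimal sketch of how that module can read the regular files out of a tar archive. It is an illustration only, not the code of Recoll's rcltar handler; the function name `tar_members` is an assumption.

```python
import io
import tarfile


def tar_members(data):
    """Return {member name: bytes} for the regular files in a tar archive."""
    contents = {}
    # "r:*" lets tarfile detect the compression (none, gzip, bz2, xz) itself.
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:*") as tf:
        for member in tf:
            if member.isfile():  # skip directories, links, devices
                contents[member.name] = tf.extractfile(member).read()
    return contents
```

Each extracted member could then be passed to whatever handler matches its own type, which is the same pattern used for other container formats.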


File types indexed with external helpers

* PDF files.
* MS-Word files.
* WordPerfect files.
* RTF files.
* Image and audio file tags.
* AbiWord files.
* FB2, EPUB, and CHM ebooks.
* KWord files.
* Microsoft Office traditional and Open XML files.
* OpenOffice files.
* SVG files.
* Okular annotations files.
* HWP files (without page numbering).
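Because these formats depend on external programs, it can be useful to check which helpers are present on the PATH before indexing. The sketch below shows one way to do that in Python. The `HELPERS` mapping is an assumption for illustration (the program names are not taken from the list above), and `missing_helpers` is a hypothetical function; consult the Recoll documentation for the helpers your version actually uses.

```python
import shutil

# Illustrative mapping only (an assumption, not taken from this article):
# MIME type -> name of the external program expected on PATH.
HELPERS = {
    "application/pdf": "pdftotext",
    "application/msword": "antiword",
    "text/rtf": "unrtf",
}


def missing_helpers(helpers, which=shutil.which):
    """Return the MIME types whose helper program cannot be found on PATH.

    The which parameter exists so the lookup can be replaced in tests.
    """
    return sorted(mime for mime, prog in helpers.items() if which(prog) is None)
```

Calling `missing_helpers(HELPERS)` returns an empty list when every listed program is installed; any MIME types it does return would go unindexed until the corresponding helper is added.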


See also

* Desktop search
* List of desktop search engines

