DocFetcher

DocFetcher is a free and open-source desktop search application. It runs on Windows, Mac OS X, and Linux and is written in Java. The application has a graphical user interface, which is written using the Standard Widget Toolkit (SWT). The program is an indexing search tool: it queries a local index of file contents rather than scanning every file on the machine at search time. As a result, the program must keep running to monitor file changes, but search results are instant. Searching is built on Apache Lucene, a widely used open-source search engine library.
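The difference between an indexing search tool and one that scans files at query time can be illustrated with a toy inverted index. The following is a minimal sketch in plain Java, not DocFetcher's or Lucene's actual implementation; the class name `InvertedIndexSketch` and its methods are hypothetical, and words are split on ASCII non-word characters only.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration: maps each lowercase word to the documents containing it.
class InvertedIndexSketch {
    private final Map<String, Set<String>> postings = new HashMap<>();

    // Index one document: split its text into words and record the document under each word.
    void add(String docName, String text) {
        for (String token : text.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                postings.computeIfAbsent(token, k -> new TreeSet<>()).add(docName);
            }
        }
    }

    // Look up a single word. No file is read at query time; only the map is consulted.
    Set<String> search(String word) {
        return postings.getOrDefault(word.toLowerCase(), Collections.emptySet());
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add("notes.txt", "Lucene is a search library");
        index.add("todo.txt", "Search the archive for invoices");
        System.out.println(index.search("search"));   // [notes.txt, todo.txt]
        System.out.println(index.search("invoices")); // [todo.txt]
    }
}
```

The cost of reading file contents is paid once, when a document is added; each query is then a map lookup. This is also why an index must be refreshed when files change.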


Features

* Unicode support
* Full-text search for all major document file formats, including:
** Office files (Microsoft Office, OpenDocument, Outlook (PST), ...)
** EPUB, PDF
** RTF, SVG, and any other plain-text files
** Audio metadata (MP3, FLAC)
** Picture metadata (JPEG)
** Archive formats (ZIP, 7z, RAR, Tar), including nested archive files
** HTML with pair detection: DocFetcher detects when an HTML file and a folder containing the page's resource files (images, scripts, ...) belong together. Such resource files are usually downloaded when a web page is saved.
* Automatic detection of file changes, with the index updated accordingly
* Exclusion of files from indexing based on regular expressions
* A query language supporting Boolean operators (OR, AND, NOT), wildcards, phrase search, fuzzy search, and proximity search
* World languages: translations into Chinese, Italian, and Ukrainian; partial translations into French, Japanese, Spanish, and German

Note that a commercial version of the program, DocFetcher Pro, is in development with additional features.
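Regex-based exclusion, as listed among the features above, can be sketched with Java's standard `java.util.regex` package. This is an illustrative sketch only: the class `ExclusionFilterSketch`, the sample patterns, and the choice to match each pattern against the whole file name are assumptions, not DocFetcher's documented behaviour.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration of excluding files from indexing by regular expression.
class ExclusionFilterSketch {
    // Keep only the file names that fully match none of the exclusion patterns.
    static List<String> filter(List<String> fileNames, List<Pattern> exclusions) {
        List<String> kept = new ArrayList<>();
        for (String name : fileNames) {
            boolean excluded = false;
            for (Pattern p : exclusions) {
                if (p.matcher(name).matches()) {
                    excluded = true;
                    break;
                }
            }
            if (!excluded) {
                kept.add(name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Assumed example patterns: temporary files and names starting with "~$".
        List<Pattern> exclusions = Arrays.asList(
                Pattern.compile(".*\\.tmp"),
                Pattern.compile("~\\$.*"));
        List<String> files = Arrays.asList(
                "report.pdf", "cache.tmp", "~$report.docx", "notes.txt");
        System.out.println(filter(files, exclusions)); // [report.pdf, notes.txt]
    }
}
```

Using `matches()` requires a pattern to cover the entire name, so `.*\.tmp` excludes `cache.tmp` but not `cache.tmp.bak`. A tool could equally use partial matching with `find()`; which convention applies is a design choice.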


See also

* List of desktop search engines


External links


* docfetcher.sourceforge.net (official website)
* Documentation wiki