DocFetcher
DocFetcher is a free and open source desktop search application. It runs on Windows, Mac OS X and Linux and is written in Java. The application has a graphical user interface, which is written using the Standard Widget Toolkit (SWT). The program is an indexing search tool: it searches a local database (index) of file content rather than scanning every file on the machine at query time. As a result, the program must keep running to monitor changes, but search results are returned almost instantly. Searching is based on Apache Lucene, a widely used open-source search engine library.

Features
* Unicode support
* Full text search for all major document file formats, including:
** Office files (Microsoft Office, OpenDocument, Outlook (PST), ...)
** EPUB, PDF
** RTF, SVG and any other plain text files
** Audio metadata (MP3, FLAC)
** Picture metadata (JPEG)
** Archive formats (ZIP, 7z, RAR, Tar), including nested archive files
** HTML with pair detection, which means that DocFetcher detects whe ...
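The difference between an indexing search tool and a tool that scans every file can be illustrated with a small inverted index. The following is only a minimal sketch in Python, not DocFetcher's or Lucene's actual implementation; the function names `build_index` and `search` and the sample documents are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"\w+", text.lower())


def build_index(documents):
    """Map every word to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def search(index, query):
    """Return the documents that contain every word of the query.

    Only the index is consulted; the documents themselves are not re-read.
    """
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical sample data, used only for illustration.
documents = {
    "notes.txt": "Lucene is a search library",
    "todo.txt": "Index the PDF files before the search",
}
index = build_index(documents)
print(search(index, "search library"))
```

The index has to be rebuilt or updated whenever a document changes, which is the trade-off the paragraph above describes: extra work to keep the index current in exchange for fast lookups.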


Lucene
Apache Lucene is a free and open-source search engine software library, originally written in Java by Doug Cutting. It is supported by the Apache Software Foundation and is released under the Apache License. Lucene is widely used as a standard foundation for non-research search applications. Lucene has been ported to other programming languages including Object Pascal, Perl, C#, C++, Python, Ruby and PHP.

History
Doug Cutting originally wrote Lucene in 1999. Lucene was his fifth search engine, having previously written two while at Xerox PARC, one at Apple, and a fourth at Excite. It was initially available for download from its home at the SourceForge web site. It joined the Apache Software Foundation's Jakarta family of open-source Java products in September 2001 and became its own top-level Apache project in February 2005. The name Lucene is Doug Cutting's wife's middle name and her maternal grandmother's first name. Lucene formerly included a number of sub-projec ...


List Of Search Engines
Search engines, including web search engines, selection-based search engines, metasearch engines, desktop search tools, and web portals and vertical market websites, have a search facility for online databases.

By content/topic

General
† Main website is a portal

Geographically localized

Accountancy
* IFACnet

Business
* Business.com
* GenieKnows (United States and Canada)
* GlobalSpec
* Nexis (Lexis Nexis)
* Thomasnet (United States)

Computers
* Shodan (website)

Content
* Openverse, search engine for open content.

Dark web
* Ahmia
* Grams
* TorSearch

Education
General:
* Chegg
* SkilledUp
Academic materials only:
* BASE (search engine)
* ChemRefer
* CiteULike
* Google Scholar
* Library of Congress
* Semantic Scholar

Enterprise
* Apache Solr
* Jumper 2.0: Universal search powered by Enterprise bookmarking
* Oracle Corporation: Secure Enterprise Search 10g
* Q-Sensei: Q-Sensei Enterprise
* Swiftype: Swiftype Search
* TeraText: TeraText ...


Microsoft Windows
Windows is a group of several proprietary graphical operating system families developed and marketed by Microsoft. Each family caters to a certain sector of the computing industry: for example, Windows NT for consumers, Windows Server for servers, and Windows IoT for embedded systems. Defunct Windows families include Windows 9x, Windows Mobile, and Windows Phone. The first version of Windows was released on November 20, 1985, as a graphical operating system shell for MS-DOS in response to the growing interest in graphical user interfaces (GUIs). Windows is the most popular desktop operating system in the world, with 75% market share, according to StatCounter. However, Windows is not the most used operating system when both mobile and desktop OSes are counted, owing to Android's massive growth. The most recent version of Windows is Windows 11 for consumer PCs and tablets, Windows 11 Enterprise for corporations, and Windows Server 2022 for servers.

Genealogy
By marketing ...


Scalable Vector Graphics
Scalable Vector Graphics (SVG) is an XML-based vector image format for defining two-dimensional graphics, having support for interactivity and animation. The SVG specification is an open standard developed by the World Wide Web Consortium since 1999. SVG images are defined in a vector graphics format and stored in XML text files. SVG images can thus be scaled in size without loss of quality, and SVG files can be searched, indexed, scripted, and compressed. The XML text files can be created and edited with text editors or vector graphics editors, and are rendered by the most-used web browsers.

Overview
SVG has been in development within the World Wide Web Consortium (W3C) since 1999 after six competing proposals for vector graphics languages had been submitted to the consortium during 1998 (see below). The early SVG Working Group decided not to develop any of the commercial submissions, but to create a new markup language that was informed by but not really based on any ...


Proximity Search (text)
In text processing, a proximity search looks for documents where two or more separately matching term occurrences are within a specified distance, where distance is the number of intermediate words or characters. In addition to proximity, some implementations may also impose a constraint on the word order, in that the order in the searched text must be identical to the order of the search query. Proximity searching goes beyond the simple matching of words by adding the constraint of proximity and is generally regarded as a form of advanced search. For example, a search could be used to find "red brick house", and match phrases such as "red house of brick" or "house made of red brick". By limiting the proximity, these phrases can be matched while avoiding documents where the words are scattered or spread across a page or in unrelated articles in an anthology.

Rationale
The basic linguistic assumption of proximity searching is that the proximity of the words in a document implies a ...
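The "red brick house" example can be made concrete with a small word-level proximity check. This is a minimal sketch in Python under stated assumptions: distance is counted as the number of intermediate (non-query) words inside the smallest window that contains every term, the query terms are distinct, and word order is not enforced. The function names are hypothetical and do not come from any particular search engine.

```python
import re
from itertools import product


def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"\w+", text.lower())


def min_intermediate_words(text, terms):
    """Return the smallest number of non-query words in any window that
    contains every term, or None if some term does not occur at all.

    Assumes the terms are distinct words.
    """
    words = tokenize(text)
    positions = []
    for term in terms:
        hits = [i for i, word in enumerate(words) if word == term.lower()]
        if not hits:
            return None
        positions.append(hits)
    best = None
    # Try every combination of one occurrence per term (fine for short texts).
    for combo in product(*positions):
        gap = max(combo) - min(combo) + 1 - len(terms)
        if best is None or gap < best:
            best = gap
    return best


def proximity_match(text, terms, max_gap):
    """True if all terms occur with at most max_gap intermediate words."""
    gap = min_intermediate_words(text, terms)
    return gap is not None and gap <= max_gap


terms = ["red", "brick", "house"]
print(proximity_match("red house of brick", terms, max_gap=2))
print(proximity_match("the red car passed a house near the old brick wall", terms, max_gap=2))
```

A production engine would normally answer such queries from positional information stored in its index rather than by re-tokenizing the text, so this sketch only illustrates the matching rule itself.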


Approximate String Matching
In computer science, approximate string matching (often colloquially referred to as fuzzy string searching) is the technique of finding strings that match a pattern approximately (rather than exactly). The problem of approximate string matching is typically divided into two sub-problems: finding approximate substring matches inside a given string and finding dictionary strings that match the pattern approximately.

Overview
The closeness of a match is measured in terms of the number of primitive operations necessary to convert the string into an exact match. This number is called the edit distance between the string and the pattern. The usual primitive operations are:
* insertion: cot → coat
* deletion: coat → cot
* substitution: coat → cost
These three operations may be generalized as forms of substitution by adding a NULL character (here symbolized by *) wherever a character has been deleted or inserted:
* insertion: co*t → coat
* delet ...
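The edit distance built from insertion, deletion and substitution can be computed with the classic dynamic-programming recurrence (often called Levenshtein distance). The following is a minimal Python sketch in which every operation costs 1; the function name `edit_distance` is chosen here for illustration.

```python
def edit_distance(source, target):
    """Count the fewest insertions, deletions and substitutions
    needed to turn source into target (each operation costs 1)."""
    # previous[j] holds the distance between the processed prefix of
    # source and the first j characters of target.
    previous = list(range(len(target) + 1))
    for i, source_char in enumerate(source, start=1):
        current = [i]  # deleting i characters yields the empty prefix
        for j, target_char in enumerate(target, start=1):
            substitution_cost = 0 if source_char == target_char else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


print(edit_distance("cot", "coat"))   # 1 (one insertion)
print(edit_distance("coat", "cost"))  # 1 (one substitution)
```

Keeping only two rows of the table keeps memory proportional to the length of the target string; the running time is proportional to the product of the two lengths.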




Phrase Search
In computer science, phrase searching allows users to retrieve content from information systems (such as documents from file storage systems, records from databases, and web pages on the internet) that contains a specific order and combination of words defined by the user. Phrase search is one of many search operators that are standard in search engine technology, along with Boolean operators (AND, OR, and NOT), truncation and wildcard operators (commonly represented by the asterisk symbol), field code operators (which look for specific words in defined fields, such as the Author field in a periodical database), and proximity operators (which look for defined words that appear close to one another, if not directly next to each other ...
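A phrase search differs from a plain keyword search in that the words must appear in the given order and directly next to each other. A minimal Python sketch of that rule follows; it assumes simple lowercase word tokenization, and the function name `contains_phrase` is hypothetical.

```python
import re


def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"\w+", text.lower())


def contains_phrase(text, phrase):
    """True if the words of phrase occur consecutively, in order, in text."""
    words = tokenize(text)
    target = tokenize(phrase)
    if not target:
        return False
    length = len(target)
    return any(
        words[start:start + length] == target
        for start in range(len(words) - length + 1)
    )


print(contains_phrase("A red brick house stood there", "red brick house"))  # True
print(contains_phrase("house made of red brick", "red brick house"))        # False
```

The second call fails even though all three words are present, which is the behavior that separates a phrase operator from a Boolean AND query.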


Wildcard Character
In software, a wildcard character is a kind of placeholder represented by a single character, such as an asterisk (*), which can be interpreted as a number of literal characters or an empty string. It is often used in file searches so the full name need not be typed.

Telecommunication
In telecommunications, a wildcard is a character that may be substituted for any of a defined subset of all possible characters.
* In high-frequency (HF) radio automatic link establishment, the wildcard character may be substituted for any one of the 36 upper-case alphanumeric characters.
* Whether the wildcard character represents a single character or a string of characters must be specified.

Computing
In computer (software) technology, a wildcard is a symbol used to replace or represent one or more characters. Algorithms for matching wildcards have been developed in a number of recursive and non-recursive varieties.

File and directory patterns
When specifying file names (or paths) in CP/M, D ...
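Wildcard matching on file names can be tried out with the `fnmatch` module from Python's standard library, which implements shell-style patterns: `*` matches any number of characters (including none) and `?` matches exactly one character. The helper `match_names` and the file names below are hypothetical and serve only as an illustration.

```python
from fnmatch import fnmatchcase


def match_names(names, pattern):
    """Return the names that match a shell-style wildcard pattern.

    fnmatchcase compares case-sensitively on every platform.
    """
    return [name for name in names if fnmatchcase(name, pattern)]


names = ["report.txt", "report_2023.txt", "notes.txt", "report.pdf"]

print(match_names(names, "report*.txt"))  # ['report.txt', 'report_2023.txt']
print(match_names(names, "?otes.txt"))    # ['notes.txt']
print(match_names(names, "*.pdf"))        # ['report.pdf']
```

Note that `report*.txt` also matches `report.txt`, because the asterisk is allowed to stand for an empty string.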


Logical Connective
In logic, a logical connective (also called a logical operator, sentential connective, or sentential operator) is a logical constant that can be used to connect logical formulas. For instance, in the syntax of propositional logic, the binary connective ∨ can be used to join the two atomic formulas P and Q, rendering the complex formula P ∨ Q. Common connectives include negation, disjunction, conjunction, and implication. In standard systems of classical logic, these connectives are interpreted as truth functions, though they receive a variety of alternative interpretations in nonclassical logics. Their classical interpretations are similar to the meanings of natural language expressions such as English "not", "or", "and", and "if", but not identical. Discrepancies between natural language connectives and those of classical logic have motivated nonclassical approaches to natural language meaning as well as approaches which pair a classical compositional semantics wi ...
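The classical reading of connectives as truth functions can be shown by enumerating their truth tables. The following is a minimal Python sketch; the dictionary `CONNECTIVES` and the helper `truth_table` are names chosen here for illustration.

```python
from itertools import product

# Classical binary connectives as functions from two truth values to one.
CONNECTIVES = {
    "conjunction": lambda p, q: p and q,
    "disjunction": lambda p, q: p or q,
    "implication": lambda p, q: (not p) or q,  # material implication
}


def negation(p):
    """The unary connective 'not'."""
    return not p


def truth_table(connective):
    """Map every pair of truth values (P, Q) to the connective's value."""
    return {
        (p, q): connective(p, q)
        for p, q in product([True, False], repeat=2)
    }


for name, connective in CONNECTIVES.items():
    print(name, truth_table(connective))
```

The table for implication is false only when P is true and Q is false, one of the places where the classical connective departs from everyday uses of "if".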


Regular Expressions
A regular expression (shortened as regex or regexp; sometimes referred to as rational expression) is a sequence of characters that specifies a search pattern in text. Usually such patterns are used by string-searching algorithms for "find" or "find and replace" operations on strings, or for input validation. Regular expression techniques are developed in theoretical computer science and formal language theory. The concept of regular expressions began in the 1950s, when the American mathematician Stephen Cole Kleene formalized the concept of a regular language. They came into common use with Unix text-processing utilities. Different syntaxes for writing regular expressions have existed since the 1980s, one being the POSIX standard and another, widely used, being the Perl syntax. Regular expressions are used in search engines, in search and replace dialogs of word processors and text editors, in text processing utilities such as sed and AWK, and in lexical analysis. Most gener ...
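The three uses mentioned above ("find", "find and replace", and input validation) can be demonstrated with Python's standard `re` module. This is a minimal sketch; the date pattern and the function names are assumptions made for the example, and the pattern checks only the digit layout, not whether the date is valid.

```python
import re

# A search pattern for dates written as YYYY-MM-DD (digit layout only).
DATE = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")


def find_dates(text):
    """Find: return every substring that matches the pattern."""
    return [match.group(0) for match in DATE.finditer(text)]


def to_day_first(text):
    """Find and replace: rewrite YYYY-MM-DD as DD/MM/YYYY."""
    return DATE.sub(r"\3/\2/\1", text)


def looks_like_date(value):
    """Input validation: the whole value must match the pattern."""
    return DATE.fullmatch(value) is not None


text = "Joined in 2001-09-15, top-level since 2005-02-01."
print(find_dates(text))
print(to_day_first(text))
print(looks_like_date("2005-2-1"))
```

Python's `re` syntax is close to the Perl style mentioned above; POSIX tools such as sed use somewhat different escaping rules for the same ideas.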


HTML
The HyperText Markup Language or HTML is the standard markup language for documents designed to be displayed in a web browser. It can be assisted by technologies such as Cascading Style Sheets (CSS) and scripting languages such as JavaScript. Web browsers receive HTML documents from a web server or from local storage and render the documents into multimedia web pages. HTML describes the structure of a web page semantically and originally included cues for the appearance of the document. HTML elements are the building blocks of HTML pages. With HTML constructs, images and other objects such as interactive forms may be embedded into the rendered page. HTML provides a means to create structured documents by denoting structural semantics for text such as headings, paragraphs, lists, links, quotes, and other items. HTML elements are delineated by tags, written using angle brackets. Tags such as <img /> and <input /> directly introduce content into the page. Other tags such as <p> surround ...


Tar (computing)
In computing, tar is a computer software utility for collecting many files into one archive file, often referred to as a tarball, for distribution or backup purposes. The name is derived from "tape archive", as it was originally developed to write data to sequential I/O devices with no file system of their own. The archive data sets created by tar contain various file system parameters, such as name, timestamps, ownership, file-access permissions, and directory organization. POSIX abandoned tar in favor of pax, yet tar sees continued widespread use.

History
The command-line utility was first introduced in Version 7 Unix in January 1979, replacing the tp program (which in turn replaced "tap"). The file structure to store this information was standardized in POSIX.1-1988 and later POSIX.1-2001, and became a format supported by most modern file archiving systems. The tar command was abandoned in POSIX.1-2001 in favor of the pax command, which was to support ust ...
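The idea of collecting several files, together with metadata such as name, size and permissions, into one archive can be tried with Python's standard `tarfile` module. The following is a minimal sketch that builds the archive in memory; the helper names `build_archive` and `list_archive` and the sample file names are hypothetical.

```python
import io
import tarfile


def build_archive(files):
    """Pack a mapping of file name -> bytes into an in-memory tar archive."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)  # the header must state the payload size
            info.mode = 0o644      # file-access permissions stored per member
            info.mtime = 0         # fixed timestamp keeps the output reproducible
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def list_archive(raw):
    """Read the member headers back: name, size and permission bits."""
    with tarfile.open(fileobj=io.BytesIO(raw), mode="r") as archive:
        return [(m.name, m.size, oct(m.mode)) for m in archive.getmembers()]


raw = build_archive({"docs/a.txt": b"hello", "docs/b.txt": b"tar"})
print(list_archive(raw))
```

The archive is a plain sequence of header and data blocks without compression; compressed tarballs are produced by passing a mode such as `"w:gz"` to `tarfile.open`.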