Taxonomy For Search Engines
Taxonomy for search engines refers to classification methods that improve relevance in vertical search. Taxonomies of entities are tree structures whose nodes are labelled with entities likely to occur in a web search query. Search engines use these trees to match keywords from a search query to keywords from answers (or snippets). Taxonomies, thesauri and concept hierarchies are crucial components of many applications in information retrieval, natural language processing and knowledge management. Building, tuning and managing taxonomies and ontologies is costly because it requires many manual operations. A number of studies have proposed building taxonomies automatically from linguistic resources and/or statistical machine learning. Tools that use the SKOS standard, including Unilexicon, PoolParty and the Lexaurus editor, are also available to streamline work with taxonomies.
See also: Feature extraction

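The matching of query keywords to snippet keywords through a taxonomy tree can be sketched as follows. This is a minimal illustration, not a description of any particular search engine: the `Node` class, the helper names and the toy travel taxonomy are all assumptions made for the example, and matching is done on whitespace-separated lower-cased words only.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One taxonomy node, labelled with an entity name (hypothetical structure)."""
    label: str
    children: list["Node"] = field(default_factory=list)


def paths_by_label(root: Node) -> dict[str, list[str]]:
    """Map each lower-cased label to its path of labels from the root."""
    index: dict[str, list[str]] = {}
    stack = [(root, [root.label])]
    while stack:
        node, path = stack.pop()
        index[node.label.lower()] = path
        for child in node.children:
            stack.append((child, path + [child.label]))
    return index


def shared_entities(query: str, snippet: str, index: dict[str, list[str]]) -> set[str]:
    """Return taxonomy labels that occur in both the query and the snippet."""
    query_terms = set(query.lower().split())
    snippet_terms = set(snippet.lower().split())
    return {term for term in query_terms & snippet_terms if term in index}


# Hypothetical toy taxonomy for a travel vertical.
taxonomy = Node("travel", [
    Node("flight", [Node("airline"), Node("airport")]),
    Node("hotel", [Node("hostel")]),
])
index = paths_by_label(taxonomy)
matches = shared_entities(
    "cheap hotel near airport",
    "budget hotel with airport shuttle",
    index,
)
```

Here `matches` holds the entities the query and the snippet have in common, and `index` gives the position of each entity in the tree, which a ranking step could use to prefer more specific (deeper) matches.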
Vertical Search
A vertical search engine is distinct from a general web search engine in that it focuses on a specific segment of online content. Vertical search engines are also called specialty or topical search engines. The vertical content area may be based on topicality, media type, or genre of content. Common verticals include shopping, the automotive industry, legal information, medical information, scholarly literature, job search and travel. Examples of vertical search engines include the Library of Congress, Mocavo, Nuroa, Trulia, and Yelp. In contrast to general web search engines, which attempt to index large portions of the World Wide Web using a web crawler, vertical search engines typically use a focused crawler, which attempts to index only web pages relevant to a pre-defined topic or set of topics. Some vertical search sites focus on individual verticals, while other sites include multiple vertical searches within one search engine.
Benefits
Vertical search offers several potential benefits over ge ...

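The idea of a focused crawler that keeps only pages relevant to a pre-defined topic can be illustrated with a simple relevance filter. This is a rough sketch under stated assumptions: the scoring rule (fraction of topic terms present in the page), the 0.5 threshold, the function names and the example URLs are all hypothetical, and real focused crawlers typically use more sophisticated classifiers.

```python
def topic_score(page_text: str, topic_terms: set[str]) -> float:
    """Fraction of the topic terms that appear as words in the page text."""
    words = set(page_text.lower().split())
    return len(words & topic_terms) / len(topic_terms)


def select_relevant(pages: dict[str, str], topic_terms: set[str],
                    threshold: float = 0.5) -> list[str]:
    """Keep the URLs whose page text scores at or above the threshold."""
    return [url for url, text in pages.items()
            if topic_score(text, topic_terms) >= threshold]


# Hypothetical fetched pages for a job-search vertical.
pages = {
    "https://example.com/jobs": "software engineer job vacancy apply with salary range",
    "https://example.com/recipes": "easy pasta recipe with tomato sauce",
}
job_terms = {"job", "vacancy", "salary", "apply"}
relevant = select_relevant(pages, job_terms)
```

Only the pages in `relevant` would be passed on to indexing; the others are discarded before they reach the index.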
Web Search Query
A web query or web search query is a query that a user enters into a web search engine to satisfy their information needs. Web search queries are distinctive in that they are often plain text and boolean search directives are rarely used. They vary greatly from standard query languages, which are governed by strict syntax rules as command languages with keyword or positional parameters.
Types
There are three broad categories that cover most web search queries: informational, navigational, and transactional. These are also called "do, know, go." Although this model of searching was not theoretically derived, the classification has been empirically validated with actual search engine queries.
* Informational queries: queries that cover a broad topic (e.g., "colorado" or "trucks") for which there may be thousands of relevant results.
* Navigational queries: queries that seek a single website or web page of a single entity (e.g., "youtube" or "delta air lines").
* Tra ...

Taxonomy (general)
Taxonomy is the practice and science of categorization or classification. A taxonomy (or taxonomical classification) is a scheme of classification, especially a hierarchical classification, in which things are organized into groups or types. Among other things, a taxonomy can be used to organize and index knowledge (stored as documents, articles, videos, etc.), such as in the form of a library classification system or a search engine taxonomy, so that users can more easily find the information they are searching for. Many taxonomies are hierarchies (and thus have an intrinsic tree structure), but not all are. Originally, taxonomy referred only to the categorisation of organisms or to a particular categorisation of organisms. In a wider, more general sense, it may refer to a categorisation of things or concepts, as well as to the principles underlying such a categorisation. Taxonomy organizes taxonomic units known as "taxa" (singular "taxon"). Taxonomy is different from me ...

Thesauri
A thesaurus (plural thesauri or thesauruses), or synonym dictionary, is a reference work for finding synonyms and sometimes antonyms of words. Thesauri are often used by writers to help find the best word to express an idea. Synonym dictionaries have a long history. The word "thesaurus" was used in 1852 by Peter Mark Roget for his Roget's Thesaurus. While some thesauri, such as Roget's Thesaurus, group words in a hierarchical hypernymic taxonomy of concepts, others are organized alphabetically or in some other way. Most thesauri do not include definitions, but many dictionaries include listings of synonyms. Some thesauri and dictionary synonym notes characterize the distinctions between similar words, with notes on their "connotations and varying shades of meaning" (American Heritage Dictionary of the English Language, 5th edition, Houghton Mifflin Harcourt, 2011, p. xxvii). Some synonym dictionaries are primarily concerned with differentiating synonyms by meaning ...

Hierarchy
A hierarchy (from Greek hierarkhia, from hierarkhes, 'president of sacred rites') is an arrangement of items (objects, names, values, categories, etc.) that are represented as being "above", "below", or "at the same level as" one another. Hierarchy is an important concept in a wide variety of fields, such as architecture, philosophy, design, mathematics, computer science, organizational theory, systems theory, systematic biology, and the social sciences (especially political philosophy). A hierarchy can link entities either directly or indirectly, and either vertically or diagonally. The only direct links in a hierarchy, insofar as they are hierarchical, are to one's immediate superior or to one of one's subordinates, although a system that is largely hierarchical can also incorporate alternative hierarchies. Hierarchical links can extend "vertically" upwards or downwards via multiple links in the same direction, following a path. All parts of the hierarchy that are not linked vertically to one ano ...

Information Retrieval
Information retrieval (IR) in computing and information science is the process of obtaining information system resources that are relevant to an information need from a collection of those resources. Searches can be based on full-text or other content-based indexing. Information retrieval is the science of searching for information in a document, searching for documents themselves, and also searching for the metadata that describes data, and for databases of texts, images or sounds. Automated information retrieval systems are used to reduce what has been called information overload. An IR system is a software system that provides access to books, journals and other documents, and that stores and manages those documents. Web search engines are the most visible IR applications.
Overview
An information retrieval process begins when a user or searcher enters a query into the system. Queries are formal statements of information needs, for example search strings in web search engines. In inf ...

Natural Language Processing
Natural language processing (NLP) is an interdisciplinary subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language, in particular how to program computers to process and analyze large amounts of natural language data. The goal is a computer capable of "understanding" the contents of documents, including the contextual nuances of the language within them. The technology can then accurately extract information and insights contained in the documents as well as categorize and organize the documents themselves. Challenges in natural language processing frequently involve speech recognition, natural-language understanding, and natural-language generation.
History
Natural language processing has its roots in the 1950s. Already in 1950, Alan Turing published an article titled "Computing Machinery and Intelligence" which proposed what is now called the Turing test as a criterion of intelligence, t ...

Knowledge Management
Knowledge management (KM) is the collection of methods relating to creating, sharing, using and managing the knowledge and information of an organization. It refers to a multidisciplinary approach to achieving organisational objectives by making the best use of knowledge. An established discipline since 1991, KM includes courses taught in the fields of business administration, information systems, management, library, and information science. Other fields may contribute to KM research, including information and media, computer science, public health and public policy. Several universities offer dedicated master's degrees in knowledge management. Many large companies, public institutions, and non-profit organisations have resources dedicated to internal KM efforts, often as a part of their business strategy, IT, or human resource management departments. Several consulting companies ...

Automatic Taxonomy Construction
Automatic taxonomy construction (ATC) is the use of software programs to generate taxonomical classifications from a body of texts called a corpus. ATC is a branch of natural language processing, which in turn is a branch of artificial intelligence. A taxonomy (or taxonomical classification) is a scheme of classification, especially a hierarchical classification, in which things are organized into groups or types. Among other things, a taxonomy can be used to organize and index knowledge (stored as documents, articles, videos, etc.), such as in the form of a library classification system or a search engine taxonomy, so that users can more easily find the information they are searching for. Many taxonomies are hierarchies (and thus have an intrinsic tree structure), but not all are. Manually developing and maintaining a taxonomy is a labor-intensive task requiring significant time and resources, including familiarity with or expertise in the taxonomy's domain (scope, subject, or ...

SKOS
Simple Knowledge Organization System (SKOS) is a W3C recommendation designed for representation of thesauri, classification schemes, taxonomies, subject-heading systems, or any other type of structured controlled vocabulary. SKOS is part of the Semantic Web family of standards built upon RDF and RDFS, and its main objective is to enable easy publication and use of such vocabularies as linked data.
History
DESIRE II project (1997–2000)
The most direct ancestor to SKOS was the RDF Thesaurus work undertaken in the second phase of the EU DESIRE project. Motivated by the need to improve the user interface and usability of multi-service browsing and searching, a basic RDF vocabulary for Thesauri was produced. As noted later in the SWAD-Europe workplan, the DESIRE work was adopted and further developed in the SOSIG and LIMBER projects. A version of the DESIRE/SOSIG implementation was described in W3C's QL'98 workshop, motivating early work on RDF rule and query languages: A ...

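To give a feel for how a SKOS vocabulary expresses a taxonomy, the sketch below models a few concepts as plain (subject, predicate, object) triples and walks the hierarchy. The property names `skos:prefLabel`, `skos:altLabel` and `skos:broader` come from the SKOS vocabulary; the `ex:` concept identifiers, the helper functions and the data are hypothetical, and no RDF library is used. The traversal assumes each concept has at most one broader concept and that the links contain no cycles, which SKOS itself does not guarantee.

```python
# Triples of (subject, predicate, object); the ex: concepts are made up.
triples = [
    ("ex:vehicles", "skos:prefLabel", "Vehicles"),
    ("ex:trucks", "skos:prefLabel", "Trucks"),
    ("ex:trucks", "skos:altLabel", "Lorries"),
    ("ex:trucks", "skos:broader", "ex:vehicles"),
    ("ex:pickups", "skos:prefLabel", "Pickup trucks"),
    ("ex:pickups", "skos:broader", "ex:trucks"),
]


def broader_chain(concept: str, triples: list[tuple[str, str, str]]) -> list[str]:
    """Follow skos:broader links upward from a concept (single parent, no cycles)."""
    parent = {s: o for s, p, o in triples if p == "skos:broader"}
    chain = []
    while concept in parent:
        concept = parent[concept]
        chain.append(concept)
    return chain


def labels(concept: str, triples: list[tuple[str, str, str]]) -> list[str]:
    """Collect the preferred and alternative labels of a concept, sorted."""
    wanted = ("skos:prefLabel", "skos:altLabel")
    return sorted(o for s, p, o in triples if s == concept and p in wanted)


ancestors = broader_chain("ex:pickups", triples)
truck_labels = labels("ex:trucks", triples)
```

In a search setting, the alternative labels would let a query for "lorries" reach the same concept as "trucks", and the broader chain would let results be grouped under more general categories.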
Feature Extraction
In machine learning, pattern recognition, and image processing, feature extraction starts from an initial set of measured data and builds derived values (features) intended to be informative and non-redundant, facilitating the subsequent learning and generalization steps, and in some cases leading to better human interpretations. Feature extraction is related to dimensionality reduction. When the input data to an algorithm is too large to be processed and it is suspected to be redundant (e.g. the same measurement in both feet and meters, or the repetitiveness of images presented as pixels), then it can be transformed into a reduced set of features (also named a feature vector). Determining a subset of the initial features is called feature selection. The selected features are expected to contain the relevant information from the input data, so that the desired task can be performed by using this reduced representation instead of the complete initial data.
General
Feature extractio ...
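The feet-and-meters example of redundancy can be made concrete with a small feature selection sketch. It is one simple approach among many, chosen here for illustration: a feature is dropped when it is almost perfectly correlated with a feature that has already been kept. The function names, the 0.999 limit and the sample data are assumptions for the example, and the code assumes no feature is constant (a constant column would cause a division by zero in the correlation).

```python
from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length, non-constant lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def drop_redundant(features: dict[str, list[float]], limit: float = 0.999) -> list[str]:
    """Keep a feature only if it is not near-perfectly correlated with a kept one."""
    kept: list[str] = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < limit for k in kept):
            kept.append(name)
    return kept


# Hypothetical measurements: the same heights in meters and in feet, plus weights.
heights_m = [1.50, 1.62, 1.75, 1.80, 1.91]
features = {
    "height_m": heights_m,
    "height_ft": [h * 3.28084 for h in heights_m],
    "weight_kg": [52.0, 80.0, 61.0, 95.0, 70.0],
}
selected = drop_redundant(features)
```

The height in feet carries no information beyond the height in meters, so it is removed, while the weight column is kept because it is only weakly correlated with height in this sample.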