Web Query Classification

	Web Query Classification A Web query topic classification/categorization is a problem in information science. The task is to assign a Web search query to one or more predefined categories, based on its topics. The importance of query classification is underscored by many services provided by Web search. A direct application is to provide better search result pages for users with interests of different categories. For example, the users issuing a Web query "''apple''" might expect to see Web pages related to the fruit apple, or they may prefer to see products or news related to the computer company. Online advertisement services can rely on the query classification results to promote different products more accurately. Search result pages can be grouped according to the categories predicted by a query classification algorithm. However, the computation of query classification is non-trivial. Different from the document classification tasks, queries submitted by Web search users are usually short and ambiguous; ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Information Science Information science (also known as information studies) is an academic field which is primarily concerned with analysis, collection, Categorization, classification, manipulation, storage, information retrieval, retrieval, movement, dissemination, and protection of information. Practitioners within and outside the field study the application and the usage of knowledge in organizations in addition to the interaction between people, organizations, and any existing information systems with the aim of creating, replacing, improving, or understanding the information systems. Historically, information science (informatics) is associated with computer science, data science, psychology, technology, library science, healthcare, and intelligence agency, intelligence agencies. However, information science also incorporates aspects of diverse fields such as archival science, cognitive science, commerce, law, linguistics, museology, management, mathematics, philosophy, Policy, public po ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Association Rules Association rule learning is a rule-based machine learning method for discovering interesting relations between variables in large databases. It is intended to identify strong rules discovered in databases using some measures of interestingness.Piatetsky-Shapiro, Gregory (1991), ''Discovery, analysis, and presentation of strong rules'', in Piatetsky-Shapiro, Gregory; and Frawley, William J.; eds., ''Knowledge Discovery in Databases'', AAAI/MIT Press, Cambridge, MA. In any given transaction with a variety of items, association rules are meant to discover the rules that determine how or why certain items are connected. Based on the concept of strong rules, Rakesh Agrawal, Tomasz Imieliński and Arun Swami introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (POS) systems in supermarkets. For example, the rule \ \Rightarrow \ found in the sales data of a supermarket would indicate that if a customer buy ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Meta Search A metasearch engine (or search aggregator) is an online information retrieval tool that uses the data of a web search engine to produce its own results. Metasearch engines take input from a user and immediately query search engines for results. Sufficient data is gathered, ranked, and presented to the users. Problems such as spamming reduces the accuracy and precision of results. The process of fusion aims to improve the engineering of a metasearch engine. Examples of metasearch engines include Skyscanner and Kayak.com, which aggregate search results of online travel agencies and provider websites and Searx, a free and open-source search engine which aggregates results from internet search engines. History The first person to incorporate the idea of meta searching was Daniel Dreilinger of Colorado State University . He developed SearchSavvy, which let users search up to 20 different search engines and directories at once. Although fast, the search engine was restricted to simple ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Support Vector Machines In machine learning, support vector machines (SVMs, also support vector networks) are supervised learning models with associated learning algorithms that analyze data for classification and regression analysis. Developed at AT&T Bell Laboratories by Vladimir Vapnik with colleagues (Boser et al., 1992, Guyon et al., 1993, Cortes and Vapnik, 1995, Vapnik et al., 1997) SVMs are one of the most robust prediction methods, being based on statistical learning frameworks or VC theory proposed by Vapnik (1982, 1995) and Chervonenkis (1974). Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that assigns new examples to one category or the other, making it a non- probabilistic binary linear classifier (although methods such as Platt scaling exist to use SVM in a probabilistic classification setting). SVM maps training examples to points in space so as to maximise the width of the gap between the two categories. New ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Naive Bayes Classifier In statistics, naive Bayes classifiers are a family of simple "probabilistic classifiers" based on applying Bayes' theorem with strong (naive) independence assumptions between the features (see Bayes classifier). They are among the simplest Bayesian network models, but coupled with kernel density estimation, they can achieve high accuracy levels. Naive Bayes classifiers are highly scalable, requiring a number of parameters linear in the number of variables (features/predictors) in a learning problem. Maximum-likelihood training can be done by evaluating a closed-form expression, which takes linear time, rather than by expensive iterative approximation as used for many other types of classifiers. In the statistics literature, naive Bayes models are known under a variety of names, including simple Bayes and independence Bayes. All these names reference the use of Bayes' theorem in the classifier's decision rule, but naive Bayes is not (necessarily) a Bayesian method. Introductio ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	Query Expansion Query expansion (QE) is the process of reformulating a given query to improve retrieval performance in information retrieval operations, particularly in the context of query understanding. In the context of search engines, query expansion involves evaluating a user's input (what words were typed into the search query area, and sometimes other types of data) and expanding the search query to match additional documents. Query expansion involves techniques such as: * Finding synonyms of words, and searching for the synonyms as well * Finding semantically related words (e.g. antonyms, meronyms, hyponyms, hypernyms) * Finding all the various morphological forms of words by stemming each word in the search query * Fixing spelling errors and automatically searching for the corrected form or suggesting it in the results * Re-weighting the terms in the original query Query expansion is a methodology studied in the field of computer science, particularly within the realm of natural langu ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Information Retrieval Information retrieval (IR) in computing and information science is the process of obtaining information system resources that are relevant to an information need from a collection of those resources. Searches can be based on full-text or other content-based indexing. Information retrieval is the science of searching for information in a document, searching for documents themselves, and also searching for the metadata that describes data, and for databases of texts, images or sounds. Automated information retrieval systems are used to reduce what has been called information overload. An IR system is a software system that provides access to books, journals and other documents; stores and manages those documents. Web search engines are the most visible IR applications. Overview An information retrieval process begins when a user or searcher enters a query into the system. Queries are formal statements of information needs, for example search strings in web search engines. In inf ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	Document Classification Document classification or document categorization is a problem in library science, information science and computer science. The task is to assign a document to one or more classes or categories. This may be done "manually" (or "intellectually") or algorithmically. The intellectual classification of documents has mostly been the province of library science, while the algorithmic classification of documents is mainly in information science and computer science. The problems are overlapping, however, and there is therefore interdisciplinary research on document classification. The documents to be classified may be texts, images, music, etc. Each kind of document possesses its special classification problems. When not otherwise specified, text classification is implied. Documents may be classified according to their subjects or according to other attributes (such as document type, author, printing year etc.). In the rest of this article only subject classification is considered. T ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Online Advertising Online advertising, also known as online marketing, Internet advertising, digital advertising or web advertising, is a form of marketing and advertising which uses the Internet to promote products and services to audiences and platform users. Online advertising includes email marketing, search engine marketing (SEM), social media marketing, many types of display advertising (including web banner advertising), and mobile advertising. Advertisements are increasingly being delivered via automated software systems operating across multiple websites, media services and platforms, known as programmatic advertising. Like other advertising media, online advertising frequently involves a publisher, who integrates advertisements into its online content, and an advertiser, who provides the advertisements to be displayed on the publisher's content. Other potential participants include advertising agencies who help generate and place the ad copy, an ad server which technologically deliver ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	Vertical Search A vertical search engine is distinct from a general web search engine, in that it focuses on a specific segment of online content. They are also called specialty or topical search engines. The vertical content area may be based on topicality, media type, or genre of content. Common verticals include shopping, the automotive industry, legal information, medical information, scholarly literature, job search and travel. Examples of vertical search engines include the Library of Congress, Mocavo, Nuroa, Trulia, and Yelp. In contrast to general web search engines, which attempt to index large portions of the World Wide Web using a web crawler, vertical search engines typically use a focused crawler which attempts to index only relevant web pages to a pre-defined topic or set of topics. Some vertical search sites focus on individual verticals, while other sites include multiple vertical searches within one search engine. Benefits Vertical search offers several potential benefits over ge ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Metasearch A metasearch engine (or search aggregator) is an online information retrieval tool that uses the data of a web search engine to produce its own results. Metasearch engines take input from a user and immediately query search engines for results. Sufficient data is gathered, ranked, and presented to the users. Problems such as spamming reduces the accuracy and precision of results. The process of fusion aims to improve the engineering of a metasearch engine. Examples of metasearch engines include Skyscanner and Kayak.com, which aggregate search results of online travel agencies and provider websites and Searx, a free and open-source search engine which aggregates results from internet search engines. History The first person to incorporate the idea of meta searching was Daniel Dreilinger of Colorado State University . He developed SearchSavvy, which let users search up to 20 different search engines and directories at once. Although fast, the search engine was restricted to simple ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Open Directory Project DMOZ (from ''directory.mozilla.org'', an earlier domain name, stylized in lowercase in its logo) was a multilingual open-content directory of World Wide Web links. The site and community who maintained it were also known as the Open Directory Project (ODP). It was owned by AOL (now a part of Verizon Media) but constructed and maintained by a community of volunteer editors. DMOZ used a hierarchical ontology scheme for organizing site listings. Listings on a similar topic were grouped into categories which then included smaller categories. DMOZ closed on March 17, 2017 because AOL no longer wished to support the project. The website became a single landing page on that day, with links to a static archive of DMOZ, and to the DMOZ discussion forum, where plans to rebrand and relaunch the directory are being discussed. , a non-editable mirror remained available at dmoztools.net, and it was announced that while the DMOZ URL would not return, a successor version of the directory nam ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]