Query Understanding

	Query Understanding Query understanding is the process of inferring the intent of a search engine user by extracting semantic meaning from the searcher’s keywords. Query understanding methods generally take place before the search engine retrieves and ranks results. It is related to natural language processing but specifically focused on the understanding of search queries. Query understanding is at the heart of technologies like Amazon Alexa, Apple's Siri, Google Assistant, IBM's Watson, and Microsoft's Cortana. Methods. Tokenization: Tokenization is the process of breaking up a text string into words or other meaningful elements called tokens. Typically, tokenization occurs at the word level. However, it is sometimes difficult to define what is meant by a "word". Often a tokenizer relies on simpl ...
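As a rough illustration of word-level tokenization, the sketch below lowercases a query and extracts runs of letters and digits, keeping an internal straight apostrophe as part of the word. The function name `tokenize` and the regular expression are assumptions made for this example; real tokenizers use richer, language-specific rules.

```python
import re

# Hypothetical helper for illustration only: a token is a run of ASCII
# letters or digits, optionally continued across a straight apostrophe
# (so "don't" stays a single token).
TOKEN_PATTERN = re.compile(r"[a-z0-9]+(?:'[a-z0-9]+)*")

def tokenize(query: str) -> list[str]:
    """Lowercase the query and split it into word-level tokens."""
    return TOKEN_PATTERN.findall(query.lower())

tokens = tokenize("Don't stop the ROCK-n-roll!")
print(tokens)
```

The hyphenated "ROCK-n-roll" is split into three tokens here, which shows why defining a "word" is a design decision rather than a given.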
	User Intent User intent, otherwise known as query intent or search intent, is the identification and categorization of what an online user intended or wanted to find when they typed their search terms into a web search engine, for the purpose of search engine optimisation or conversion rate optimisation. Examples of user intent are fact-checking, comparison shopping, or navigating to other websites. Optimizing for user intent: To increase ranking on search engines, marketers need to create content that best satisfies queries entered by users on their smartphones or desktops. Creating content with user intent in mind helps increase the value of the information being showcased. Keyword research can help determine user intent. The search terms a user enters into a web search engine to find content, services, or products are the words that should be used on the webpage to optimize for user intent. Google, Petal, and Sogou can show Search Engine Results Page (SERP) features such as featured sn ...
	Text Segmentation Text segmentation is the process of dividing written text into meaningful units, such as words, sentences, or topics. The term applies both to mental processes used by humans when reading text, and to artificial processes implemented in computers, which are the subject of natural language processing. The problem is non-trivial, because while some written languages have explicit word boundary markers, such as the word spaces of written English and the distinctive initial, medial and final letter shapes of Arabic, such signals are sometimes ambiguous and not present in all written languages. Compare speech segmentation, the process of dividing speech into linguistically meaningful portions. Segmentation problems. Word segmentation: Word segmentation is the problem of dividing a string of written language into its component words. In English and many other languages using some form of the Latin alphabet, the space is a good approximation of a word divider (word delimiter), a ...
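To make the difficulty of word segmentation without spaces concrete, here is a minimal sketch of greedy longest-match segmentation. This technique is not named in the entry above; it is swapped in purely for illustration, and the function `max_match` and the toy vocabulary are hypothetical.

```python
def max_match(text: str, vocabulary: set[str]) -> list[str]:
    """Greedy longest-match segmentation over a known vocabulary.

    At each position, take the longest vocabulary word that starts there;
    if none matches, emit a single character and move on.
    """
    longest = max((len(word) for word in vocabulary), default=1)
    words = []
    i = 0
    while i < len(text):
        # Try candidate end positions from longest to shortest.
        for j in range(min(len(text), i + longest), i, -1):
            if text[i:j] in vocabulary:
                break
        else:
            j = i + 1  # no vocabulary word starts here
        words.append(text[i:j])
        i = j
    return words

# Toy vocabulary, assumed for this example only.
vocab = {"the", "table", "down", "there", "theta", "bled", "own"}
print(max_match("thetabledownthere", vocab))
```

With this vocabulary the greedy strategy returns "theta bled own there" rather than "the table down there", which is one way boundary signals can be ambiguous even when every piece is a valid word.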
	Lemmatisation Lemmatisation (or lemmatization) in linguistics is the process of grouping together the inflected forms of a word so they can be analysed as a single item, identified by the word's lemma, or dictionary form. In computational linguistics, lemmatisation is the algorithmic process of determining the lemma of a word based on its intended meaning. Unlike stemming, lemmatisation depends on correctly identifying the intended part of speech and meaning of a word in a sentence, as well as within the larger context surrounding that sentence, such as neighboring sentences or even an entire document. As a result, developing efficient lemmatisation algorithms is an open area of research. Description: In many languages, words appear in several inflected forms. For example, in English, the verb 'to walk' may appear as 'walk', 'walked', 'walks' or 'walking'. The base form, 'walk', that one might look up in a dictionary, is called the lemma for the word. The association of the base form ...
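The contrast with stemming can be sketched with a toy example. The lookup table, the part-of-speech labels, and both function names below are hypothetical; a real lemmatiser would rely on a full lexicon and a part-of-speech tagger rather than a hand-written dictionary.

```python
# Toy lemma table keyed by (word form, part of speech); assumed data.
LEMMA_TABLE = {
    ("walked", "VERB"): "walk",
    ("walks", "VERB"): "walk",
    ("walking", "VERB"): "walk",
    ("meeting", "VERB"): "meet",
    ("meeting", "NOUN"): "meeting",
}

def lemmatize(word: str, pos: str) -> str:
    """Look up the lemma for a word form given its part of speech."""
    key = (word.lower(), pos)
    return LEMMA_TABLE.get(key, word.lower())

def naive_stem(word: str) -> str:
    """Strip a common English suffix without using any context."""
    word = word.lower()
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

print(lemmatize("meeting", "NOUN"), lemmatize("meeting", "VERB"))
print(naive_stem("meeting"))
```

The stemmer always maps "meeting" to "meet", whereas the lemmatiser keeps the noun intact and only reduces the verb, which is the dependence on part of speech described above.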
	Suffix In linguistics, a suffix is an affix which is placed after the stem of a word. Common examples are case endings, which indicate the grammatical case of nouns and adjectives, and verb endings, which form the conjugation of verbs. Suffixes can carry grammatical information (inflectional suffixes) or lexical information (derivational/lexical suffixes). An inflectional suffix is sometimes called a grammatical suffix; such inflection changes the grammatical properties of a word within its syntactic category. Derivational suffixes can be divided into two categories: class-changing derivation and class-maintaining derivation. Particularly in the study of Semitic languages, suffixes are called afformatives, as they can alter the form of the words. In Indo-European studies, a distinction is made between suffixes and endings (see Proto-Indo-European root). A word-final segment that is somewhere between a free morpheme and a b ...
	Affix In linguistics, an affix is a morpheme that is attached to a word stem to form a new word or word form. Affixes may be derivational, like English -ness and pre-, or inflectional, like English plural -s and past tense -ed. They are bound morphemes by definition; prefixes and suffixes may be separable affixes. Affixation is the linguistic process that speakers use to form different words by adding morphemes at the beginning (prefixation), the middle (infixation) or the end (suffixation) of words. Positional categories of affixes: Prefix and suffix may be subsumed under the term adfix, in contrast to infix. When marking text for interlinear glossing, as in the third column in the chart above, simple affixes such as prefixes and suffixes are separated from the stem with hyphens. Affixes which disrupt the stem, or which themselves are discontinuous, are often marked off with angle brackets. Reduplication is often shown with a tilde. Affixes which c ...
	Finnish Language Finnish is a Uralic language of the Finnic branch, spoken by the majority of the population in Finland and by ethnic Finns outside of Finland. Finnish is one of the two official languages of Finland (the other being Swedish). In Sweden, both Finnish and Meänkieli (which has significant mutual intelligibility with Finnish) are official minority languages. The Kven language, which like Meänkieli is mutually intelligible with Finnish, is spoken in the Norwegian county Troms og Finnmark by a minority group of Finnish descent. Finnish is typologically agglutinative and uses almost exclusively suffixal affixation. Nouns, adjectives, pronouns, numerals and verbs are inflected depending on their role in the sentence. Sentences are normally formed with subject–verb–object word order, although the extensive use of inflection allows them to be ordered differently. Word order variations are often reserved for differences in information structure. Finnish orth ...
	Word Stem In linguistics, a word stem is a part of a word responsible for its lexical meaning. The term is used with slightly different meanings depending on the morphology of the language in question. In Athabaskan linguistics, for example, a verb stem is a root that cannot appear on its own and that carries the tone of the word. Athabaskan verbs typically have two stems in this analysis, each preceded by prefixes. In most cases, a word stem is not modified during its declension, while in some languages it can be modified (apophony) according to certain morphological rules or peculiarities, such as sandhi. For example, in Polish: miasto ("city"), but w mieście ("in the city"). In English: "sing", "sang", "sung". Uncovering and analyzing cognation between word stems and roots within and across languages has allowed comparative philology and comparative linguistics to determine the history of languages and language families. Usage: In one usage, a word stem is a form to which affixes can be attached. T ...
	Root (linguistics) A root (or root word) is the core of a word that is irreducible into more meaningful elements. In morphology, a root is a morphologically simple unit which can be left bare or to which a prefix or a suffix can attach. The root word is the primary lexical unit of a word, and of a word family (this root is then called the base word), which carries aspects of semantic content and cannot be reduced into smaller constituents. Content words in nearly all languages contain, and may consist only of, root morphemes. However, sometimes the term "root" is also used to describe the word without its inflectional endings, but with its lexical endings in place. For example, "chatters" has the inflectional root or lemma "chatter", but the lexical root "chat". Inflectional roots are often called stems, and a root in the stricter sense, a root morpheme, may be thought of as a monomorphemic stem. The traditional definition allows roots to be either free morphemes or bound morphemes. Root ...
	Lemma (morphology) In morphology and lexicography, a lemma (plural lemmas or lemmata) is the canonical form, dictionary form, or citation form of a set of word forms. In English, for example, "break", "breaks", "broke", "broken" and "breaking" are forms of the same lexeme, with "break" as the lemma by which they are indexed. Lexeme, in this context, refers to the set of all the inflected or alternating forms in the paradigm of a single word, and lemma refers to the particular form that is chosen by convention to represent the lexeme. Lemmas have special significance in highly inflected languages such as Arabic, Turkish and Russian. The process of determining the lemma for a given lexeme is called lemmatisation. The lemma can be viewed as the chief of the principal parts, although lemmatisation is at least partly arbitrary. Morphology: The form of a word that is chosen to serve as the lemma is usually the least marked form, but there are several exceptions such as ...
	Inflection In linguistic morphology, inflection (or inflexion) is a process of word formation in which a word is modified to express different grammatical categories such as tense, case, voice, aspect, person, number, gender, mood, animacy, and definiteness. The inflection of verbs is called conjugation, and one can refer to the inflection of nouns, adjectives, adverbs, pronouns, determiners, participles, prepositions and postpositions, numerals, articles, etc., as declension. An inflection expresses grammatical categories with affixation (such as prefix, suffix, infix, circumfix, and transfix), apophony (as Indo-European ablaut), or other modifications. For example, the Latin verb ducam, meaning "I will lead", includes the suffix -am, expressing person (first), number (singular), and tense-mood (future indicative or present subjunctive). The use of this suffix is an inflection. In contrast, in the English clause "I will lead", the word "lead" is not inflected for any of pe ...
	Noisy Channel Model The noisy channel model is a framework used in spell checkers, question answering, speech recognition, and machine translation. In this model, the goal is to find the intended word given a word where the letters have been scrambled in some manner. In spell-checking: Given an alphabet $\Sigma$, let $\Sigma^*$ be the set of all finite strings over $\Sigma$. Let the dictionary $D$ of valid words be some subset of $\Sigma^*$, i.e., $D \subseteq \Sigma^*$. The noisy channel is the matrix $\Gamma_{ws} = \Pr(s \mid w)$, where $w \in D$ is the intended word and $s \in \Sigma^*$ is the scrambled word that was actually received. The goal of the noisy channel model is to find the intended word given the scrambled word that was received. The decision function $\sigma : \Sigma^* \to D$ is a function that, given a scrambled word, returns the intended word. Methods of constructing a decision function include the maximum likelihood rule, the maximum a posteriori rule, and the minimum distance rule. In ...
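A decision function built with the maximum a posteriori rule can be sketched as follows: choose the dictionary word w that maximises Pr(s | w) · Pr(w). The channel model here is an assumption made for illustration, with Pr(s | w) taken to decay exponentially in the Levenshtein distance between s and w; the prior values, the `penalty` parameter, and the function names are likewise hypothetical.

```python
import math

def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]

def decide(s: str, prior: dict[str, float], penalty: float = 2.0) -> str:
    """MAP-style decision: argmax over w of log Pr(w) + log Pr(s | w).

    Assumption: log Pr(s | w) is approximated, up to a constant, by
    -penalty * edit_distance(s, w). This is a toy channel, not a fitted one.
    """
    def log_score(w: str) -> float:
        return math.log(prior[w]) - penalty * edit_distance(s, w)
    return max(prior, key=log_score)

# Hypothetical dictionary D with prior probabilities Pr(w).
prior = {"the": 0.7, "then": 0.2, "than": 0.1}
print(decide("thn", prior))
```

For the received string "thn" all three candidates are one edit away, so the prior breaks the tie. Setting every prior equal would reduce this to the minimum distance rule under the same assumed channel.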
	A Priori Probability An a priori probability is a probability that is derived purely by deductive reasoning. One way of deriving a priori probabilities is the principle of indifference, which has the character of saying that, if there are N mutually exclusive and collectively exhaustive events and if they are equally likely, then the probability of a given event occurring is 1/N. Similarly, the probability of one of a given collection of K events is K/N. One disadvantage of defining probabilities in the above way is that it applies only to finite collections of events. In Bayesian inference, "uninformative priors" or "objective priors" are particular choices of a priori probabilities. Note that "prior probability" is a broader concept. Similar to the distinction in philosophy between a priori and a posteriori, in Bayesian inference a priori denotes general knowledge about the data distribution before making an inference, while a posteriori denotes knowledge t ...
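The principle of indifference reduces to simple arithmetic, which the short sketch below works through with exact fractions. The function name `indifference_probability` is hypothetical, and the die is only an example of N = 6 equally likely outcomes.

```python
from fractions import Fraction

def indifference_probability(k: int, n: int) -> Fraction:
    """Probability that one of k chosen events occurs, out of n mutually
    exclusive, collectively exhaustive, equally likely events."""
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("require n > 0 and 0 <= k <= n")
    return Fraction(k, n)

# A fair six-sided die: one specific face, then any even face.
print(indifference_probability(1, 6))
print(indifference_probability(3, 6))
```

The check on n reflects the limitation stated above: the rule only applies to a finite, non-empty collection of events.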