
Keyword spotting (or, more simply, word spotting) is a problem that was historically first defined in the context of speech processing. In speech processing, keyword spotting deals with the identification of keywords in utterances. Keyword spotting is also defined as a separate, but related, problem in the context of document image processing. In document image processing, keyword spotting is the problem of finding all instances of a query word that exist in a scanned document image, without fully recognizing the text of the document.


In speech processing

The first works in keyword spotting appeared in the late 1980s. A special case of keyword spotting is wake word (also called hot word) detection, used by personal digital assistants such as Alexa or Siri to activate the dormant speaker, in other words to "wake up" when their name is spoken. In the United States, the National Security Agency has made use of keyword spotting since at least 2006. This technology allows analysts to search through large volumes of recorded conversations and isolate mentions of suspicious keywords. Recordings can be indexed, and analysts can run queries over the database to find conversations of interest. IARPA funded research into keyword spotting in the Babel program. Some algorithms used for this task are:
* Sliding window and garbage model
* K-best hypothesis
* Iterative Viterbi decoding
* Convolutional neural network on Mel-frequency cepstrum coefficients
* Transformer-based small-footprint keyword spotting
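
As a rough illustration of the sliding window and garbage model approach, the sketch below slides a window over a sequence of one-dimensional frame features and compares a keyword model against a garbage (filler) model fitted to the whole utterance. Everything here is an assumption made for illustration: the function name `spot_keyword`, the scalar features, the unit-variance Gaussian keyword model, and the single-Gaussian garbage model are hypothetical stand-ins, and real systems typically use multidimensional acoustic features and HMM or neural models.

```python
import math


def gaussian_loglik(x, mean, var):
    """Log-likelihood of scalar x under a Gaussian with the given mean and variance."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def spot_keyword(frames, template, threshold=0.0, keyword_var=1.0):
    """Return (start_index, score) for every window that looks like the keyword.

    frames   -- list of scalar frame features (toy stand-in for acoustic features)
    template -- expected feature value for each frame of the keyword (hypothetical model)
    The score is the per-frame log-likelihood ratio of keyword model vs. garbage model.
    """
    n = len(template)
    if n == 0 or len(frames) < n:
        return []

    # Garbage model (assumption): one Gaussian fitted to all frames of the utterance.
    g_mean = sum(frames) / len(frames)
    g_var = max(sum((x - g_mean) ** 2 for x in frames) / len(frames), 1e-6)

    hits = []
    for start in range(len(frames) - n + 1):
        window = frames[start:start + n]
        keyword_ll = sum(
            gaussian_loglik(x, m, keyword_var) for x, m in zip(window, template)
        )
        garbage_ll = sum(gaussian_loglik(x, g_mean, g_var) for x in window)
        score = (keyword_ll - garbage_ll) / n
        if score > threshold:
            hits.append((start, score))
    return hits


# Toy utterance: background frames near 0 with the keyword pattern embedded at index 4.
frames = [0, 0, 0, 0, 1, 5, 1, 0, 0, 0, 0]
hits = spot_keyword(frames, template=[1, 5, 1])
hit_starts = [start for start, _ in hits]
```

Dividing the log-likelihood ratio by the window length keeps the threshold comparable across keywords of different durations; the threshold itself would normally be tuned on held-out data.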


In document image processing

Keyword spotting in document image processing can be seen as an instance of the more generic problem of content-based image retrieval (CBIR). Given a query, the goal is to retrieve the most relevant instances of words in a collection of scanned documents. The query may be a text string (query-by-string keyword spotting) or a word image (query-by-example keyword spotting).
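
The retrieval step of query-by-example keyword spotting can be sketched as a nearest-neighbour ranking. The example below assumes that every word image, including the query, has already been segmented and converted into a fixed-length descriptor; the descriptors, the word identifiers, and the function name `rank_by_example` are hypothetical, and Euclidean distance is only one possible similarity measure.

```python
import math


def rank_by_example(query_descriptor, word_descriptors, top_k=3):
    """Return the ids of the top_k word images closest to the query descriptor.

    query_descriptor -- feature vector of the query word image (assumed precomputed)
    word_descriptors -- dict mapping a word-image id to its feature vector
    """
    scored = []
    for word_id, descriptor in word_descriptors.items():
        if len(descriptor) != len(query_descriptor):
            raise ValueError(f"descriptor length mismatch for {word_id}")
        # Smaller Euclidean distance means a more relevant word instance.
        scored.append((math.dist(query_descriptor, descriptor), word_id))
    scored.sort()
    return [word_id for _, word_id in scored[:top_k]]


# Hypothetical collection of three word images from scanned pages.
collection = {
    "page1_word7": [0.9, 0.1, 0.0],
    "page2_word3": [0.1, 0.8, 0.1],
    "page4_word1": [0.85, 0.15, 0.05],
}
ranked = rank_by_example([1.0, 0.0, 0.0], collection, top_k=2)
```

Query-by-string keyword spotting would additionally need a way to map a text string into the same descriptor space, which this sketch does not attempt.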

