SpeechBot
SpeechBot was a web search engine for streaming media content, developed at Compaq's (later HP's) research laboratories in Cambridge, MA and Australia. Compaq launched the website at Streaming Media West 1999 in San Jose, CA. The internet radio shows indexed by SpeechBot included The Motley Fool, Fresh Air, Talk of the Nation, The Dr. Laura Program, and Dreamland with Art Bell. By June 2003, the service had indexed over 17,000 hours of multimedia content. The website was taken offline in 2005, after HP closed its Cambridge research lab. The SpeechBot indexing workflow involved a farm of Windows workstations that retrieved the streaming content and a Linux cluster that ran speech recognition to transcribe the spoken audio. The web server, search index and metadata library were hosted on AlphaServers running Tru64 UNIX. If a transcript was already available, it was aligned to the audio stream; otherwise, an approximate transcript was produced using speech recognition. T ...

Web Search Engine
A search engine is a software system designed to carry out web searches. It searches the World Wide Web in a systematic way for particular information specified in a textual web search query. The search results are generally presented in a line of results, often referred to as search engine results pages (SERPs). When a user enters a query into a search engine, the engine scans its index of web pages to find those that are relevant to the user's query. The results are then ranked by relevance and displayed to the user. The information may be a mix of links to web pages, images, videos, infographics, articles, research papers, and other types of files. Some search engines also mine data available in databases or open directories. Unlike web directories and social bookmarking sites, which are maintained by human editors, search engines also maintain real-time information by running an algorithm on a web crawler. Any internet-based content that can't be indexed and searched ...

Linux
Linux is a family of open-source Unix-like operating systems based on the Linux kernel, an operating system kernel first released on September 17, 1991, by Linus Torvalds. Linux is typically packaged as a Linux distribution, which includes the kernel and supporting system software and libraries, many of which are provided by the GNU Project. Many Linux distributions use the word "Linux" in their name, but the Free Software Foundation uses the name "GNU/Linux" to emphasize the importance of GNU software, causing some controversy. Popular Linux distributions include Debian, Fedora Linux, and Ubuntu, the latter of which itself consists of many different distributions and modifications, including Lubuntu and Xubuntu. Commercial distributions include Red Hat Enterprise Linux and SUSE Linux Enterprise. Desktop Linux distributions include a windowing system such as X11 or Wayland, and a desktop environment such as GNOME or KDE Plasma. Distributions intended for ser ...

1999 Software
Figure (1999 events collage), from left, clockwise: the funeral procession of King Hussein of Jordan in Amman; the 1999 İzmit earthquake kills over 17,000 people in Turkey; the Columbine High School massacre, one of the first major school shootings in the United States; the Year 2000 problem ("Y2K"), perceived as a major concern in the lead-up to the year 2000; the Millennium Dome opens in London; online music downloading platform Napster is launched, soon a source of online piracy; NASA loses both the Mars Climate Orbiter and the Mars Polar Lander; a destroyed T-55 tank near Prizren during the Kosovo War.
1999 was designated as the Interna ...


Defunct Internet Search Engines
Defunct (no longer in use or active) may refer to:
* Defunct (video game), 2014
* Zombie process or defunct process, in Unix-like operating systems
See also:
* Former entities
* End-of-life product
* Obsolescence
Obsolescence is the state of being which occurs when an object, service, or practice is no longer maintained or required even though it may still be in good working order. It usually happens when something that is more efficient or less risky r ...


Word Error Rate
Word error rate (WER) is a common metric of the performance of a speech recognition or machine translation system. The general difficulty of measuring performance lies in the fact that the recognized word sequence can have a different length from the reference word sequence (supposedly the correct one). The WER is derived from the Levenshtein distance, working at the word level instead of the phoneme level. The WER is a valuable tool for comparing different systems as well as for evaluating improvements within one system. This kind of measurement, however, provides no details on the nature of translation errors and further work is therefore required to identify the main source(s) of error and to focus any research effort. This problem is solved by first aligning the recognized word sequence with the reference (spoken) word sequence using dynamic string alignment. Examination of this issue is seen through a theory called the power law that states the correlation between perplexity an ...
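The word-level Levenshtein computation described above can be sketched as follows. This is a minimal illustration, not the scoring code of any particular toolkit: it assumes whitespace tokenization and the commonly used definition WER = (substitutions + deletions + insertions) / number of reference words. The function name `word_error_rate` is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER between a reference transcript and a recognizer output.

    Assumption: words are separated by whitespace and compared exactly.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    # Total edits divided by the reference length.
    return prev[-1] / len(ref)
```

Because insertions are counted, the value can exceed 1.0 when the hypothesis is much longer than the reference.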


CMU Sphinx
CMU Sphinx, also called Sphinx for short, is the general term for a group of speech recognition systems developed at Carnegie Mellon University. These include a series of speech recognizers (Sphinx 2 to 4) and an acoustic model trainer (SphinxTrain). In 2000, the Sphinx group at Carnegie Mellon committed to open-sourcing several speech recognizer components, including Sphinx 2 and later Sphinx 3 (in 2001). The speech decoders come with acoustic models and sample applications. The available resources also include software for acoustic model training and language model compilation, and a public domain pronunciation dictionary, cmudict. Sphinx encompasses a number of software systems, described below.
Sphinx
Sphinx is a continuous-speech, speaker-independent recognition system making use of hidden Markov acoustic models (HMMs) and an n-gram statistical language model. It was developed by Kai-Fu Lee. Sphinx featured feasibility of continuous-speech, speaker-independe ...




Transcription (linguistics)
Transcription in the linguistic sense is the systematic representation of spoken language in written form. The source can either be utterances (speech or sign language) or preexisting text in another writing system. Transcription should not be confused with translation, which means representing the meaning of text from a source language in a target language (e.g. Los Angeles, from source-language Spanish, means The Angels in the target language English); or with transliteration, which means representing the spelling of a text from one script to another. In the academic discipline of linguistics, transcription is an essential part of the methodologies of (among others) phonetics, conversation analysis, dialectology, and sociolinguistics. It also plays an important role for several subfields of speech technology. Common examples for transcriptions outside academia are the proceedings of a court hearing such as a criminal trial (by a court reporter) or a physicia ...

Tru64 UNIX
Tru64 UNIX is a discontinued 64-bit UNIX operating system for the Alpha instruction set architecture (ISA), currently owned by Hewlett-Packard (HP). Previously, Tru64 UNIX was a product of Compaq, and before that, Digital Equipment Corporation (DEC), where it was known as Digital UNIX (originally DEC OSF/1 AXP). As its original name suggests, Tru64 UNIX is based on the OSF/1 operating system. DEC's previous UNIX product was known as Ultrix and was based on BSD. Tru64 UNIX is unusual among commercial UNIX implementations, as it is built on top of the Mach kernel developed at Carnegie Mellon University. (Other UNIX and UNIX-like implementations built on top of the Mach kernel are GNU Hurd, NeXTSTEP, MkLinux, macOS and Apple iOS.) Tru64 UNIX required the SRM boot firmware found on Alpha-based computer systems.
DEC OSF/1 AXP
In 1988, Digital Equipment Corporation (DEC) joined with IBM, Hewlett-Packard, and others to form the Open Software Foundation (OSF). A primary aim was to dev ...

AlphaServer
AlphaServer is a series of server computers, produced from 1994 onwards by Digital Equipment Corporation, and later by Compaq and HP. AlphaServers were based on the DEC Alpha 64-bit microprocessor. Supported operating systems for AlphaServers are Tru64 UNIX (formerly Digital UNIX), OpenVMS, MEDITECH MAGIC and Windows NT (on earlier systems, with AlphaBIOS ARC firmware), while enthusiasts have provided alternative operating systems such as Linux, NetBSD, OpenBSD and FreeBSD. The Alpha processor was also used in a line of workstations, AlphaStation. Some AlphaServer models were rebadged in white enclosures as Digital Servers for the Windows NT server market. These so-called "white box" models comprised the following:
* Digital Server 3300/3305: rebadged AlphaServer 800
* Digital Server 5300/5305: rebadged AlphaServer 1200
* Digital Server 7300/7305/7310: rebadged AlphaServer 4100
As part of the roadmap to phase out Alpha-, MIPS- and PA-RISC-based systems in favor of Itan ...

Metadata
Metadata is "data that provides information about other data", but not the content of the data, such as the text of a message or the image itself. There are many distinct types of metadata, including:
* Descriptive metadata – the descriptive information about a resource. It is used for discovery and identification. It includes elements such as title, abstract, author, and keywords.
* Structural metadata – metadata about containers of data that indicates how compound objects are put together, for example, how pages are ordered to form chapters. It describes the types, versions, relationships, and other characteristics of digital materials.
* Administrative metadata – the information to help manage a resource, like resource type, permissions, and when and how it was created.
* Reference metadata – the information about the contents and quality of statistical data.
* Statistical metadata – also called process data, may describe processes that collect, process, or produce st ...


Index (Search Engine)
Search engine indexing is the collecting, parsing, and storing of data to facilitate fast and accurate information retrieval. Index design incorporates interdisciplinary concepts from linguistics, cognitive psychology, mathematics, informatics, and computer science. An alternate name for the process, in the context of search engines designed to find web pages on the Internet, is ''web indexing''. Popular engines focus on the full-text indexing of online, natural language documents. Media types such as pictures, video, audio, and graphics are also searchable. Meta search engines reuse the indices of other services and do not store a local index whereas cache-based search engines permanently store the index along with the corpus. Unlike full-text indices, partial-text services restrict the depth indexed to reduce index size. Larger services typically perform indexing at a predetermined time interval due to the required time and processing costs, while agent-based search engines inde ...
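The collect, parse, and store steps can be illustrated with a toy full-text inverted index. This is only a sketch under simplifying assumptions (lowercased whitespace tokens, no stemming or ranking, all query terms required); the names `build_index` and `search` are hypothetical and do not come from any specific engine.

```python
from collections import defaultdict


def build_index(documents: dict[str, str]) -> dict[str, set[str]]:
    """Map each token to the set of document ids that contain it."""
    index: dict[str, set[str]] = defaultdict(set)
    for doc_id, text in documents.items():
        # Parsing step: naive lowercase, whitespace tokenization.
        for token in text.lower().split():
            index[token].add(doc_id)
    return dict(index)


def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Return ids of documents containing every query term."""
    terms = query.lower().split()
    if not terms:
        return set()
    # Look up the posting set of each term and intersect them.
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)
```

Looking up a term is a dictionary access rather than a scan of every document, which is the reason for building the index ahead of query time.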

Web Server
A web server is computer software and underlying hardware that accepts requests via HTTP (the network protocol created to distribute web content) or its secure variant HTTPS. A user agent, commonly a web browser or web crawler, initiates communication by making a request for a web page or other resource using HTTP, and the server responds with the content of that resource or an error message. A web server can also accept and store resources sent from the user agent if configured to do so. The hardware used to run a web server can vary according to the volume of requests that it needs to handle. At the low end of the range are embedded systems, such as a router that runs a small web server as its configuration interface. A high-traffic Internet website might handle requests with hundreds of servers that run on racks of high-speed computers. A resource sent from a web server can be a preexisting file (static content) available to the web server, or it can be generated ...
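The request and response cycle described above can be sketched with Python's standard `http.server` module. This is a minimal illustration only: the `PAGES` table, the `Handler` class, and the `make_server` helper are hypothetical names, and the server handles static content for GET requests and nothing else.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical static content, keyed by request path.
PAGES = {"/": b"<h1>Hello</h1>"}


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = PAGES.get(self.path)
        if body is None:
            # Unknown resource: answer with an error message.
            self.send_error(404, "Not Found")
            return
        # Known resource: answer with its content.
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet; the default logs to stderr


def make_server(port: int = 0) -> HTTPServer:
    """Create a server bound to localhost; port 0 picks a free port."""
    return HTTPServer(("127.0.0.1", port), Handler)
```

Calling `make_server(8000).serve_forever()` would serve the page at `http://127.0.0.1:8000/` until interrupted. A production server would add HTTPS, concurrency, and dynamic content generation.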