CereProc

CereProc is a speech synthesis company based in Edinburgh, Scotland, founded in 2005. The company specialises in creating natural and expressive-sounding text-to-speech voices, in synthesis voices with regional accents, and in voice cloning.


Voice building technology

CereProc creates voices using two different voice-building technologies: unit selection synthesis and parametric modelling.

CereProc's unit selection voices are built from large databases of recorded speech. During database creation, each recorded utterance is segmented into some or all of the following: individual phones, syllables, morphemes, words, phrases, and sentences. The segmentation is performed by a specially modified speech recogniser. An index of the units in the speech database is then created from the segmentation and from acoustic parameters such as fundamental frequency (pitch), duration, position in the syllable, and neighbouring phones. At runtime, the desired target utterance is created by determining the best chain of candidate units from the database (unit selection). Unit selection provides the greatest naturalness because it applies digital signal processing (DSP) to the recorded speech only at concatenation points, and DSP often makes recorded speech sound less natural.

CereProc's parametric voices produce speech using statistical modelling. In this system, the frequency spectrum (vocal tract), fundamental frequency (vocal source), and duration (prosody) of speech are modelled simultaneously. Speech waveforms are generated from these parameters using a vocoder. Critically, these voices can be built from significantly less recorded speech than unit selection voices and have a much smaller footprint when installed; for this reason they are used for private voice cloning.
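The "best chain of candidate units" search described above is commonly framed as a dynamic-programming (Viterbi-style) problem that balances how well each unit matches the target against how smoothly adjacent units join. The following is a minimal sketch of that general idea, not CereProc's implementation: the `Unit` and `Target` classes, the cost functions, and all weights are hypothetical and chosen only for illustration.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """One indexed unit from the speech database (hypothetical fields)."""
    phone: str
    pitch_hz: float
    duration_ms: float
    utterance_id: int  # which recording the unit was cut from
    position: int      # index of the unit within that recording


@dataclass(frozen=True)
class Target:
    """Desired properties of one unit in the utterance to synthesise."""
    phone: str
    pitch_hz: float
    duration_ms: float


def target_cost(t: Target, u: Unit) -> float:
    # Illustrative assumption: penalise pitch and duration mismatch equally.
    return abs(t.pitch_hz - u.pitch_hz) / 100.0 + abs(t.duration_ms - u.duration_ms) / 100.0


def join_cost(a: Unit, b: Unit) -> float:
    # Units that were adjacent in the original recording need no
    # concatenation (and so no DSP), which makes the join free.
    if a.utterance_id == b.utterance_id and b.position == a.position + 1:
        return 0.0
    # Otherwise pay a fixed penalty plus a pitch-discontinuity penalty.
    return 1.0 + abs(a.pitch_hz - b.pitch_hz) / 100.0


def select_units(targets: list[Target], database: list[Unit]) -> list[Unit]:
    """Return the lowest-cost chain of database units for the targets."""
    # Index the database by phone so candidates can be looked up quickly.
    index: dict[str, list[Unit]] = {}
    for unit in database:
        index.setdefault(unit.phone, []).append(unit)

    best_paths: list[tuple[float, list[Unit]]] | None = None
    for t in targets:
        candidates = index.get(t.phone, [])
        if not candidates:
            raise ValueError(f"no unit in database for phone {t.phone!r}")
        layer = []
        for u in candidates:
            tc = target_cost(t, u)
            if best_paths is None:
                layer.append((tc, [u]))
            else:
                # Cheapest way to reach u from any path in the previous layer.
                cost, path = min(
                    ((c + join_cost(p[-1], u), p) for c, p in best_paths),
                    key=lambda item: item[0],
                )
                layer.append((cost + tc, path + [u]))
        best_paths = layer

    if not best_paths:
        return []
    return min(best_paths, key=lambda item: item[0])[1]


# Tiny demonstration with made-up data.
demo_db = [
    Unit("h", 120.0, 80.0, 0, 0),
    Unit("e", 120.0, 80.0, 0, 1),
    Unit("l", 120.0, 80.0, 0, 2),
    Unit("e", 200.0, 80.0, 1, 0),
]
demo_targets = [Target(p, 120.0, 80.0) for p in ("h", "e", "l")]
demo_chain = select_units(demo_targets, demo_db)
print([(u.utterance_id, u.position) for u in demo_chain])
```

In this sketch the zero join cost for units that were contiguous in the original recording is what favours long, untouched stretches of recorded speech, which corresponds to the naturalness argument made above.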


Voices and languages

CereProc has 58 generally available voices that speak 23 languages in a number of different regional accents:
* American English: Isabella, Katherine, Hannah, Megan, Adam, Nathan, Andy (child voice), Jordan (child voice), Carolyn, Sam (non-binary)
* Southern English: Sarah, William, Jack, Lauren, Giles, Amy, Lily (child voice)
* Northern English: Jess
* Scottish English: Heather, Kirsty, Stuart, Andrew (child voice), Mairi (child voice)
* Glasgow English: Dodo
* Lancashire English: Claire
* Irish English: Caitlin
* West Midlands English: Sue
* Special FX voices: Demon, Ghost, Goblin, Pixie, Robot
* Metropolitan French: Suzanne, Laurent
* Canadian French: Florence
* Catalan: Rita
* Castilian Spanish: Sara
* Mexican Spanish: Ana
* Italian: Laura, Dario, Francesco (child voice), Nicoletta (child voice)
* Irish: Peig
* Dutch: Ada
* Standard German: Gudrun, Alex
* Austrian German: Leopold
* European Portuguese: Lúcia
* Brazilian Portuguese: Gabriel
* Japanese: Yuki
* Scottish Gaelic: Ceitidh
* Swedish: Ylva
* Polish: Pola
* Romanian: Daria
* French-accented English: Nicole
* Russian: Avrora
* Mandarin: Mailin
* Danish: Marie
* Norwegian (Bokmål): Clara
* Norwegian (Nynorsk): Hulda
* Lithuanian: Mantas, Egle

In addition, the company has developed a number of celebrity voices that are not generally available to the public. These include George W. Bush, Barack Obama and Arnold Schwarzenegger.


Voice cloning

In 2009, film critic Roger Ebert employed CereProc to create a synthetic version of his voice. Ebert had lost the power of speech following surgery to treat thyroid cancer. CereProc mined tapes and DVD commentaries featuring Ebert's voice to create a text-to-speech voice that sounded more like his own. Ebert used the voice in his March 2, 2010, appearance on ''The Oprah Winfrey Show''.

Former NFL player Steve Gleason had his voice cloned by CereProc following his diagnosis with motor neurone disease (MND). Gleason appeared in Microsoft's Super Bowl XLVIII commercial praising the power of technology, using his synthetic voice to narrate.

CereProc voice cloning technology is currently being used in the UK by people with MND to create synthesis voices before they lose the power of speech. This process was featured in a BBC Radio 4 documentary, ''Giving the Critic Back His Voice'', broadcast in August 2011 ("Giving the Critic Back His Voice", BBC Radio Scotland Programmes; retrieved October 26, 2011).


System compatibility

CereProc voices can be deployed on different operating systems and on different types of devices. CereProc desktop voices are compatible with Microsoft Windows and Apple Mac OS X. They install as system voices and can be used by other speech-enabled applications. CereProc's client/server system, cServer, aimed principally at the corporate interactive voice response (IVR) market, can be run on Windows and Linux. CereProc Mobile voices can be deployed on Android and Apple iOS. The SDK is available for Android, Linux, macOS, iOS, and Windows, and has bindings for C/C++, C#, Java, and Python.


See also

* Language
* Natural language processing
* Speech processing
* List of screen readers


External links

* Roger Ebert demonstrates his CereProc voice at TED2011 at 7:28