MBROLA
   HOME

TheInfoList



OR:

MBROLA is
speech synthesis Speech synthesis is the artificial production of human speech. A computer system used for this purpose is called a speech synthesizer, and can be implemented in software or hardware products. A text-to-speech (TTS) system converts normal languag ...
software Software is a set of computer programs and associated documentation and data. This is in contrast to hardware, from which the system is built and which actually performs the work. At the lowest programming level, executable code consists ...
as a worldwide collaborative project. The MBROLA project web page provides diphone databases for many spoken
language Language is a structured system of communication. The structure of a language is its grammar and the free components are its vocabulary. Languages are the primary means by which humans communicate, and may be conveyed through a variety of met ...
s. The MBROLA software is not a complete
speech synthesis Speech synthesis is the artificial production of human speech. A computer system used for this purpose is called a speech synthesizer, and can be implemented in software or hardware products. A text-to-speech (TTS) system converts normal languag ...
system for all those languages; the
text Text may refer to: Written word * Text (literary theory), any object that can be read, including: **Religious text, a writing that a religious tradition considers to be sacred **Text, a verse or passage from scripture used in expository preachin ...
must first be transformed into
phoneme In phonology and linguistics, a phoneme () is a unit of sound that can distinguish one word from another in a particular language. For example, in most dialects of English, with the notable exception of the West Midlands and the north-west o ...
and
prosodic In linguistics, prosody () is concerned with elements of speech that are not individual phonetic segments (vowels and consonants) but are properties of syllables and larger units of speech, including linguistic functions such as intonation, str ...
information in MBROLA's format, and separate software (e.g. eSpeakNG) is necessary.


History

MBROLA project started in 1995 at TCTS Lab of the Faculté polytechnique de Mons (Belgium) as a scientific project to obtain a set of speech synthesizers for as many languages as possible. First release of MBROLA software was in 1996 and was provided as
freeware Freeware is software, most often proprietary, that is distributed at no monetary cost to the end user. There is no agreed-upon set of rights, license, or EULA that defines ''freeware'' unambiguously; every publisher defines its own rules for the f ...
for non-commercial, non-military application. Licenses for created voice databases differ, but are also mostly for non-commercial and non-military use. Due to its free usage only for non-commercial applications, MBROLA was as alternative choice for private/home users for de facto
speech synthesis Speech synthesis is the artificial production of human speech. A computer system used for this purpose is called a speech synthesizer, and can be implemented in software or hardware products. A text-to-speech (TTS) system converts normal languag ...
engine eSpeakNG in
Linux Linux ( or ) is a family of open-source Unix-like operating systems based on the Linux kernel, an operating system kernel first released on September 17, 1991, by Linus Torvalds. Linux is typically packaged as a Linux distribution, which ...
workstations, but mostly was not used for commercial solutions (e.g. for speaking time clocks, boarding notifications for ports and terminals etc.) After initial development of voice databases updates and support of MBROLA software ceased and gradually closed-source binaries fell behind development of recent hardware and operating systems. To deal with this MBROLA development team decided to release MBROLA as
open source Open source is source code that is made freely available for possible modification and redistribution. Products include permission to use the source code, design documents, or content of the product. The open-source model is a decentralized sof ...
software, and on October 24, 2018, source code was released on GitHub with
GNU Affero General Public License The GNU Affero General Public License (GNU AGPL) is a free, copyleft license published by the Free Software Foundation in November 2007, and based on the GNU General Public License, version 3 and the Affero General Public License. The Free So ...
. On January 23, 2019, tool called MBROLATOR was released to provide creation of MBROLA database from
WAV Waveform Audio File Format (WAVE, or WAV due to its filename extension; pronounced "wave") is an audio file format standard, developed by IBM and Microsoft, for storing an audio bitstream on PCs. It is the main format used on Microsoft Wind ...
files with the same license.


Used technology

MBROLA software uses MBROLA (Multi-Band Resynthesis OverLap Add)
algorithm In mathematics and computer science, an algorithm () is a finite sequence of rigorous instructions, typically used to solve a class of specific Computational problem, problems or to perform a computation. Algorithms are used as specificat ...
for speech generation. Although it is
diphone In phonetics, a diphone is an adjacent pair of phones in an utterance. For example, in aɪfəʊn the diphones are a ɪ ªf É™ ™ÊŠ Šn The term is usually used to refer to a recording of the transition between two phones. In the following d ...
-based, the quality of MBROLA's synthesis is considered to be higher than that of most diphone synthesisers as it preprocesses the diphones imposing constant pitch and
harmonic A harmonic is a wave with a frequency that is a positive integer multiple of the ''fundamental frequency'', the frequency of the original periodic signal, such as a sinusoidal wave. The original signal is also called the ''1st harmonic'', the ...
phases that enhances their concatenation while only slightly degrading their segmental quality. MBROLA is a time-domain algorithm similar to
PSOLA PSOLA (Pitch Synchronous Overlap and Add) is a digital signal processing technique used for speech processing and more specifically speech synthesis. It can be used to modify the pitch and duration of a speech signal. It was invented around 198 ...
, which implies very low computational load at synthesis time. Unlike PSOLA, however, MBROLA does not require a preliminary marking of pitch periods. This feature has made it possible to develop the MBROLA project around the MBROLA algorithm, through which many speech research labs,
companies A company, abbreviated as co., is a legal entity representing an association of people, whether natural, legal or a mixture of both, with a specific objective. Company members share a common purpose and unite to achieve specific, declared go ...
, or
individual An individual is that which exists as a distinct entity. Individuality (or self-hood) is the state or quality of being an individual; particularly (in the case of humans) of being a person unique from other people and possessing one's own Maslow ...
s around the world have provided diphone
database In computing, a database is an organized collection of data stored and accessed electronically. Small databases can be stored on a file system, while large databases are hosted on computer clusters or cloud storage. The design of databases sp ...
s for many languages and voices.


References


MBROLA source code repository



External links


MBROLA voices (database for MBROLA speech synthesizer)

MBROLATOR (database creation tool for MBROLA speech synthesizer)
{{Speech synthesis Speech synthesis Computational linguistics