
Kaldi is an open-source toolkit written in C++ for speech recognition and signal processing, freely available under the Apache License v2.0. Kaldi aims to provide software that is flexible and extensible, and is intended for use by automatic speech recognition (ASR) researchers for building a recognition system. It supports linear transforms, MMI, boosted MMI and MCE discriminative training, feature-space discriminative training, and deep neural networks.

Kaldi can generate acoustic features such as MFCC, filterbank (fbank), and fMLLR. For this reason, a popular use of Kaldi in recent deep neural network research is to pre-process raw waveforms into acoustic features for end-to-end neural models. Kaldi has been incorporated as part of the CHiME Speech Separation and Recognition Challenge over several successive events. The software was initially developed as part of a 2009 workshop at Johns Hopkins University.

Kaldi is named after the legendary Ethiopian goat herder Kaldi, who was said to have discovered the coffee plant.
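As an illustration of the fMLLR features mentioned above: fMLLR is commonly described as an affine transform, estimated per speaker, that maps each feature vector x to Ax + b. The sketch below only shows how such a transform would be applied once it is available; it does not estimate one. It is plain Python and not Kaldi's API. The function name `apply_fmllr` and the matrix values are hypothetical, chosen purely for illustration.

```python
def apply_fmllr(frames, transform):
    """Apply an affine transform W = [A | b] to each feature frame.

    frames: list of D-dimensional feature vectors (lists of floats).
    transform: D rows, each of length D + 1; the last column is the bias b.
    """
    adapted = []
    for x in frames:
        extended = list(x) + [1.0]  # append 1 so the bias is part of the product
        adapted.append(
            [sum(w * v for w, v in zip(row, extended)) for row in transform]
        )
    return adapted


# Hypothetical 2-dimensional example (real acoustic features have many more dimensions).
W = [
    [2.0, 0.0, 1.0],   # y0 = 2.0 * x0 + 1.0
    [0.0, 0.5, -1.0],  # y1 = 0.5 * x1 - 1.0
]
frames = [[1.0, 4.0], [0.0, 2.0]]
adapted = apply_fmllr(frames, W)
```

Appending a constant 1 to each frame lets the bias be stored as an extra column of the matrix, so the whole adaptation is a single matrix-vector product per frame.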


See also

* fMLLR
* List of speech recognition software


External links

* Kaldi - The official GitHub project
* How to start with Kaldi and Speech Recognition - A guide regarding the different parts of the system
* Kaldi paper - The Kaldi Speech Recognition Toolkit
* VOSK - open source and commercial models from Alpha Cephei on Kaldi foundations