The Medical Intelligence and Language Engineering Laboratory, also known as MILE lab, is a research laboratory at the Indian Institute of Science, Bangalore, under the Department of Electrical Engineering. The lab is known for its work on image processing, online handwriting recognition, text-to-speech and optical character recognition systems, all of which focus mainly on documents and speech in Indian languages. The lab is headed by A. G. Ramakrishnan.


Research focus

One of the commitments of MILE lab is the development of technology that lets people with visual impairment access knowledge from any available printed material in Indian languages. Its work towards this goal has so far included: document mosaicing of coloured, camera-captured images; text extraction from complex colour images, including camera-captured images; document layout analysis; detection of broken and merged characters; OCR technology for Tamil and Kannada; text-to-speech conversion in Tamil and Kannada; pitch modification using the discrete cosine transform in the source domain; automated part-of-speech tagging; phrase prediction; and prosody modeling. Mozhi Vallan, the Tamil OCR product developed by MILE Lab, is being used by Worth Trust and Karna Vidya Technology Centre, Chennai, for the conversion of printed school and college books to Braille format.
Sri Ramakrishna Math, Chennai, is using it to convert its printed philosophical books in Tamil to computer-readable text. Lipi Gnani, the Kannada OCR developed by MILE Lab, is being used by the Braille Transcription Centers of Mitrajyothi and Canara Bank Relief & Welfare Society, Bangalore, for similar purposes. Thirukkural, the Tamil TTS system developed by MILE Lab, is being used by some school teachers in Singapore for assignments. Madhura, the Kannada TTS developed by the lab, is being used by two blind students, integrated with a screen reader, to read aloud text recognised by Lipi Gnani from Kannada books.

The lab is currently researching machine listening. It has proposed a novel temporal feature, the plosion index, which has been shown to be extremely effective in detecting the closure-burst transitions of stop consonants and affricates in continuous speech, even in noise. Another proposed feature is DCTILPR, a voice-source-based feature vector that improves the recognition performance of a speaker identification system.

In its early days, the lab carried out significant work in medical signal and image processing. A unique algorithm was proposed for
ECG compression by treating each cardiac cycle as a vector and applying linear prediction on the discrete wavelet transform of this vector, after normalizing its period using multirate-processing-based interpolation. The maturity of the fetal lung was predicted using image texture features obtained from the liver and lung regions of ultrasound images of pregnant women. An effective technique was proposed for lossless compression of 3D magnetic resonance images of the brain. Each MRI slice was represented by a uniform or adaptive mesh; an affine transformation was applied between the corresponding mesh elements of adjacent slices, and context-based entropy coding was applied to the residues.
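
The ECG compression idea described above (period normalization, wavelet transform of each cardiac cycle, then prediction) can be illustrated with a minimal sketch. This is not the lab's published algorithm: the function names, the choice of a Haar wavelet, the use of plain linear interpolation in place of multirate interpolation, and the trivial predictor (the previous beat's coefficients) are all assumptions made for illustration only.

```python
import math

def normalize_period(beat, target_len):
    """Resample one cardiac cycle to target_len (>= 2) samples.

    Assumption: plain linear interpolation stands in for the
    multirate-processing-based interpolation named in the text.
    """
    n = len(beat)
    out = []
    for i in range(target_len):
        pos = i * (n - 1) / (target_len - 1)  # fractional source index
        lo = int(math.floor(pos))
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        out.append((1 - frac) * beat[lo] + frac * beat[hi])
    return out

def haar_dwt(x):
    """Full multi-level orthonormal Haar DWT; len(x) must be a power of two."""
    coeffs = list(x)
    n = len(coeffs)
    s = math.sqrt(2)
    while n > 1:
        half = n // 2
        approx = [(coeffs[2 * i] + coeffs[2 * i + 1]) / s for i in range(half)]
        detail = [(coeffs[2 * i] - coeffs[2 * i + 1]) / s for i in range(half)]
        coeffs[:n] = approx + detail  # approximations first, then details
        n = half
    return coeffs

def prediction_residuals(beats, target_len=64):
    """Predict each beat's DWT coefficients from the previous beat's.

    Assumption: a first-order predictor with coefficient 1; a real codec
    would fit the predictor and then quantize and entropy-code the residuals.
    """
    prev = [0.0] * target_len
    residuals = []
    for beat in beats:
        c = haar_dwt(normalize_period(beat, target_len))
        residuals.append([a - b for a, b in zip(c, prev)])
        prev = c
    return residuals
```

Because consecutive heartbeats are similar once their periods are aligned, the residuals tend to be much smaller than the coefficients themselves, which is what makes them cheaper to encode. For two identical beats, the second residual vector in this sketch is all zeros.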


External links


Department of Electrical Engineering, IISc
Medical Intelligence and Language Engineering laboratory