CuneiForm (software)

CuneiForm Cognitive OpenOCR is a freely distributed open-source OCR system developed by the Russian software company Cognitive Technologies. Cognitive Technologies first released CuneiForm OCR as a commercial product in 1993. The system was bundled with popular scanners, MFPs and software in Russia and the rest of the world, from vendors including Corel (Corel Draw), Hewlett-Packard, Epson, Xerox, Samsung, Brother, Mustek, OKI, Canon and Olivetti.
In 2008 Cognitive Technologies opened the program's source code.


Features

CuneiForm converts electronic copies of paper documents and image files into an editable form, in automatic or semi-automatic mode, while preserving the structure and fonts of the original document. The system includes two components, for single and batch processing of electronic documents. In addition to its individually supported languages, the system supports a mixture of Russian and English. Recognition of other mixed languages is supported only in the branch developed by Andrei Borovsky in 2009. Training the system to recognize other languages is difficult, since each language is tied to a dat file whose structure and development method have not been disclosed by the developers.


History

1993 - Cognitive Technologies signed an OEM contract with Corel, under the terms of which the Cognitive recognition library was embedded into the Corel Draw 3.0 (and later versions) package, popular in the publishing sphere.

1994 - Contract with Hewlett-Packard on equipping all scanners imported into Russia with CuneiForm OCR. This was the first HP contract with a Russian software company.

1995
* Contract with the Japanese corporation Epson on supplying its scanners with CuneiForm OCR.
* OEM contract with Brother Corporation, the world's largest manufacturer of fax machines, laser printers, scanners and other office equipment. Under the agreement, the new roller scanner Brother IC-150 was equipped worldwide with Cognitive software for scanning and recognition.

1996
* OEM agreement with Samsung Information Systems America, one of the world's largest manufacturers of monitors, fax machines, laser printers, MFPs and other office equipment. Under the agreement, the new multifunction device Samsung OFFICE MASTER OML-8630A was to be equipped worldwide with the Cognitive Cuneiform LE optical character recognition system.
* OEM agreement with Xerox, a leading global manufacturer of office equipment, on equipping the multifunction devices Xerox 3006 and Pro-610 with the CuneiForm recognition system.
* Release of CuneiForm '96 OCR, with the first adaptive recognition algorithms in the world. Adaptive recognition is a method that combines two types of printed character recognition algorithms: multifont and omnifont. The system generates an internal font for each input document from its well-printed characters, dynamically adjusting (adapting) to the specific input symbols. The method thus combines the universality and technological efficiency of the omnifont approach with the high accuracy of font-specific recognition, which dramatically improves the recognition rate.

1997 - The first use of neural network-based technologies in CuneiForm.
The neural network character recognition algorithms work as follows: the character image to be recognized (the pattern) is reduced to a standard size (normalized). The luminance values of the normalized pattern are used as the input parameters of the neural network. The number of output parameters of the network is equal to the number of recognizable characters. The recognition result is the symbol that corresponds to the maximum value of the network's output vector.
* New OEM agreement with Canon on equipping multifunction devices imported into Russia with the CuneiForm system.
* New OEM contract with OKI Europe Limited on equipping the OKI FAX 4100 and OKI FAX 5200 MFPs imported into Russia with the CuneiForm system.
* Release of the first CuneiForm MMX Update OCR system for Intel MMX processors.
* NeuHause scanners come with the CuneiForm recognition system.
* Release of CuneiForm 98 NEST, Russia's first network scanning system.

1999
* New OEM contract with Olivetti on supplying the multifunction devices imported into Russia with the CuneiForm system.
* Distribution agreement with WSKA (France), a leading European software distributor, on the distribution of OCR Cuneiform Direct in Europe.
* Release of a new version of the system, Cuneiform 2000, which implements the "cognitive analysis TM" method: an expert system integrated into the recognition core analyzes the alternatives and estimates output by each recognition algorithm and chooses the best option.
* The "Meridian table segmentation TM" method is developed to improve the accuracy of recreating the original form of a table in the output document.
* The "What you scan is what you get TM" mechanism for recreating the original document form is introduced. The technology is aimed at preserving the original placement of the scanned document's components. This is particularly important for documents with complex topology: multicolumn texts with headings, annotations, graphic illustrations, tables, etc.

2001 - OEM contract with Canon on equipping its scanners and multifunction devices for Eastern Europe with Cognitive Technologies CuneiForm OCR software.
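
The neural network recognition steps described in the 1997 entry above (normalize the pattern, feed its luminance values to the network, take the character with the largest output) can be sketched as follows. This is a minimal Python illustration under stated assumptions, not CuneiForm's actual code: the 8x8 pattern size, the three-symbol alphabet, the single linear layer, and the hand-built weights (standing in for trained ones) are all hypothetical, as are the function names.

```python
SIZE = 8          # hypothetical side length of the normalized pattern
ALPHABET = "|-/"  # toy set of recognizable characters (assumption)


def normalize(image, size=SIZE):
    """Resample a 2D luminance grid (rows of 0.0..1.0 values) to
    size x size using nearest-neighbour sampling."""
    rows, cols = len(image), len(image[0])
    return [
        [image[r * rows // size][c * cols // size] for c in range(size)]
        for r in range(size)
    ]


def toy_templates(size=SIZE):
    """Ideal patterns for '|', '-' and '/' used to build toy weights."""
    mid = size // 2 - 1
    vertical = [[1.0 if c == mid else 0.0 for c in range(size)]
                for r in range(size)]
    horizontal = [[1.0 if r == mid else 0.0 for c in range(size)]
                  for r in range(size)]
    diagonal = [[1.0 if c == size - 1 - r else 0.0 for c in range(size)]
                for r in range(size)]
    return [vertical, horizontal, diagonal]


def weights_from_templates(templates):
    """Stand-in for training: +1 where the template has ink, -1 elsewhere."""
    return [[2.0 * v - 1.0 for row in t for v in row] for t in templates]


def network_outputs(pattern, weights):
    """Single linear layer: one output value per recognizable character."""
    inputs = [v for row in pattern for v in row]
    return [sum(w * x for w, x in zip(ws, inputs)) for ws in weights]


def recognize(image, weights, alphabet=ALPHABET):
    """Return the character whose network output is the maximum."""
    outputs = network_outputs(normalize(image), weights)
    best = max(range(len(outputs)), key=outputs.__getitem__)
    return alphabet[best]


if __name__ == "__main__":
    weights = weights_from_templates(toy_templates())
    # A 16x16 "scan" of a two-pixel-wide vertical stroke.
    scan = [[1.0 if c in (6, 7) else 0.0 for c in range(16)]
            for r in range(16)]
    print(recognize(scan, weights))  # prints |
```

A real recognizer would use a trained, multi-layer network and a much larger alphabet; the sketch keeps only the shape of the pipeline: fixed-size input, one output per character, and an argmax over the output vector.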


Development prospects

* December 12, 2007: a freeware version of OCR CuneiForm was released, and the opening of its source code was announced.
* April 2, 2008: the source code of the Cuneiform OCR was published under the BSD license, followed in the fall by the source code of the system's interface.
* The open-source version for Windows has not been updated since February 14, 2009. That version is no longer available for download; the version of November 11, 2008 is offered on the download page instead.
* In 2009, graphical interfaces for the open version of Cuneiform based on the Qt 4 library, Cuneiform-Qt and YAGF, were released.
* Starting with version 0.9.0, the open version for Linux can be used as a library.


See also

* Puma.NET is a wrapper library for the Cognitive Technologies CuneiForm recognition engine. It makes it easy to incorporate OCR functionality into any .NET Framework 2.0 (or higher) application.


External links


Cognitive OpenOCR, version 11, BSD