Ocrad
   HOME

TheInfoList



OR:

Ocrad is an
optical character recognition Optical character recognition or optical character reader (OCR) is the electronic or mechanical conversion of images of typed, handwritten or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a sc ...
program and part of the
GNU Project The GNU Project () is a free software, mass collaboration project announced by Richard Stallman on September 27, 1983. Its goal is to give computer users freedom and control in their use of their computers and computing devices by collaborat ...
. It is
free software Free software or libre software is computer software distributed under terms that allow users to run the software for any purpose as well as to study, change, and distribute it and any adapted versions. Free software is a matter of liberty, no ...
licensed A license (or licence) is an official permission or permit to do, use, or own something (as well as the document of that permission or permit). A license is granted by a party (licensor) to another party (licensee) as an element of an agreeme ...
under the GNU GPL. Based on a
feature extraction In machine learning, pattern recognition, and image processing, feature extraction starts from an initial set of measured data and builds derived values (features) intended to be informative and non-redundant, facilitating the subsequent learning a ...
method, it reads images in
portable pixmap Netpbm (formerly Pbmplus) is an open-source package of graphics programs and a programming library. It is used mainly in the Unix world, where one can find it included in all major open-source operating system distributions, but also works on Micr ...
formats known as Portable anymap and produces text in byte (8-bit) or
UTF-8 UTF-8 is a variable-length character encoding used for electronic communication. Defined by the Unicode Standard, the name is derived from ''Unicode'' (or ''Universal Coded Character Set'') ''Transformation Format 8-bit''. UTF-8 is capable of ...
formats. Also included is a layout analyser, able to separate the columns or blocks of text normally found on printed pages.


User interface

Ocrad can be used as a stand-alone
command-line A command-line interpreter or command-line processor uses a command-line interface (CLI) to receive commands from a user in the form of lines of text. This provides a means of setting parameters for the environment, invoking executables and pro ...
application or as a back-end to other programs. Kooka, which was the
KDE KDE is an international free software community that develops free and open-source software. As a central development hub, it provides tools and resources that allow collaborative work on this kind of software. Well-known products include the ...
environment's default scanning application until KDE 4, can use Ocrad as its OCR engine. Since conversion to newer Qt versions, current versions of KDE no longer contain Kooka; development continues in the KDE git repository. Ocrad can be also used as an OCR engine in
OCRFeeder OCRFeeder is an optical character recognition suite for GNOME, which also supports virtually any command-line OCR engine, such as CuneiForm, GOCR, Ocrad and Tesseract. It converts paper documents to digital document files and can serve to make t ...
.


History

Ocrad has been developed by Antonio Diaz Diaz since 2003. Version 0.7 was released in February 2004, 0.14 in February 2006 and 0.18 in May 2009. It is written in
C++ C++ (pronounced "C plus plus") is a high-level general-purpose programming language created by Danish computer scientist Bjarne Stroustrup as an extension of the C programming language, or "C with Classes". The language has expanded significan ...
. Archives of the bug-ocrad mailing list go back to October 2003.


Notes


References


External links


Ocrad GNU Project HomepagePeter Selinger's Review of Linux OCR software (2007)Andreas Gohr Linux OCR Software Comparison (2010)Online OCR server powered by OcradTesseract & Ocrad comparison, Linux Journal (2007)
{{OCR Free software programmed in C++ GNU Project software Optical character recognition 2003 software