Tamil Script Code for Information Interchange

Tamil Script Code for Information Interchange (TSCII) is a coding scheme for representing the Tamil script. The lower 128 codepoints are plain ASCII; the upper 128 codepoints are TSCII-specific. After many years of use on the Internet by private agreement only, it was registered with the IANA in 2007.

TSCII encodes the characters in visual (written) order, paralleling the use of the Tamil typewriter. Unicode has used the logical order encoding strategy for Tamil, following ISCII, in contrast to the case of Thai, where the visual order encoding grandfathered by TIS-620 was adopted. The government of Tamil Nadu endorses its own TAB/TAM standards for 8-bit encoding, and other, older encoding schemes can still be found on the WWW. The free etext collection at Project Madurai uses the TSCII encoding, but has already started to provide Unicode versions.
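The difference between visual and logical order can be illustrated with a small Python sketch. It works purely on Unicode code points and does not produce TSCII bytes; the function name `to_visual_order` is made up for this illustration. It assumes each left-side vowel sign follows a single consonant, so conjuncts and other special cases are not handled.

```python
import unicodedata

# Tamil vowel signs that are written to the LEFT of their consonant:
# U+0BC6 (e), U+0BC7 (ee), U+0BC8 (ai).
LEFT_SIGNS = {"\u0bc6", "\u0bc7", "\u0bc8"}

def to_visual_order(text: str) -> str:
    """Reorder Unicode (logical order) Tamil text into written order.

    Simplified sketch: assumes a left-side vowel sign applies to the
    single character immediately before it.
    """
    # NFD splits the two-part vowel signs (o, oo, au) into a left half
    # and a right half, e.g. U+0BCA -> U+0BC6 + U+0BBE.
    decomposed = unicodedata.normalize("NFD", text)
    out = []
    for ch in decomposed:
        if ch in LEFT_SIGNS and out:
            # Move the left-side sign in front of the preceding character.
            out.insert(len(out) - 1, ch)
        else:
            out.append(ch)
    return "".join(out)

# "ko" is stored in Unicode as consonant KA followed by vowel sign O.
visual = to_visual_order("\u0b95\u0bca")
print([hex(ord(c)) for c in visual])  # ['0xbc6', '0xb95', '0xbbe']
```

In the output, the left half of the vowel sign comes first, then the consonant, then the right half, which is the order in which the glyphs are written.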


History

The need for a common encoding for Tamil was felt by members of various mailing-list-based forums in the mid-1990s, as multiple custom-coded fonts were prevalent in those forums. While some of the commercial encodings were more popular than others, they were not accepted by the wider community because of conflicting commercial interests. While Unicode was accepted by most as the future standard, most desktop systems at that time were still not capable of handling Unicode for the Tamil language, and an interim 8-bit encoding was required.

A separate mailing list for discussion of such encodings (webmasters@tamil.net) was created in 1997, starting with an email written by Dr. K. Kalyanasundaram to the popular Tamil author Sujatha, who headed the committee for standardization of the Tamil keyboard. This forum quickly attracted enthusiastic participants from across the globe, including several prominent Tamil scholars. Archives of these discussions are maintained by the International Forum for Information Technology in Tamil (INFITT). After TSCII was published, most of the members of the webmasters@tamil.net mailing list became part of INFITT, which is a wider initiative to bring standardization and continued development to various areas of Tamil computing.


Codepage layout


Conversion Tools

UTF-8 encoded documents can be converted to TSCII using the GNU iconv tool as follows:

    $ iconv -f utf-8 -t tscii hello.utf8 > hello.tscii

Conversion from TSCII to UTF-8 is done by swapping the encoding names given to the -f and -t flags.
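Both directions can also be driven from a script. The following is a minimal Python sketch, assuming an `iconv` binary with TSCII support is available on the PATH; the helper names `iconv_command` and `convert` are hypothetical and exist only for this illustration.

```python
import subprocess

def iconv_command(from_enc: str, to_enc: str) -> list[str]:
    """Build the iconv argument list for one conversion direction."""
    return ["iconv", "-f", from_enc, "-t", to_enc]

def convert(data: bytes, from_enc: str, to_enc: str) -> bytes:
    """Pipe data through iconv (stdin to stdout) and return the result."""
    result = subprocess.run(
        iconv_command(from_enc, to_enc),
        input=data,
        capture_output=True,
        check=True,  # raise CalledProcessError if iconv rejects the input
    )
    return result.stdout

# UTF-8 -> TSCII and back: only the two encoding names swap places.
to_tscii = iconv_command("utf-8", "tscii")
to_utf8 = iconv_command("tscii", "utf-8")
```

Calling `convert(data, "utf-8", "tscii")` on the contents of hello.utf8 would correspond to the command line shown above, provided the installed iconv recognizes the `tscii` name.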


Visual Application

An open-source project, AnyTaFont2UTF8, is available; it is maintained by the Isaiyini Tamil Community.


See also

* TACE16 (Tamil All Character Encoding)


External links


* TSCII Start Page
* Unicode Technical Note #15: Text conversion From TSCII 1.7 to Unicode
* INFITT (International Forum for Information Technology in Tamil)
* TSCII to Unicode Online & Webpage Conversion
* Padma, a Mozilla extension for transforming TSCII to Unicode