VSCII (Vietnamese Standard Code for Information Interchange), also known as TCVN 5712, ISO-IR-180, .VN, ABC or simply the TCVN encodings, is a set of three closely related Vietnamese national standard character encodings for using the Vietnamese language with computers, developed by the TCVN Technical Committee on Information Technology (TCVN/TC1) and first adopted in 1993 (as TCVN 5712:1993). It should not be confused with the similarly named unofficial VISCII encoding, which was sometimes used by overseas Vietnamese speakers. VISCII was also intended to stand for "Vietnamese Standard Code for Information Interchange", but is not related to VSCII. VSCII (TCVN) was used extensively in the north of Vietnam, while VNI was popular in the south. Unicode and the Windows-1258 code page are now used for virtually all Vietnamese computer data, but legacy files or archived messages may need conversion.

Encodings

All three forms of VSCII keep the 95 printable characters of ASCII unmodified.

VSCII-3, also known as TCVN 5712-3, VN3 or simply TCVN3, includes the fewest assignments. It is an extended ASCII, because it keeps all 128 codes of ASCII unmodified. It does not reassign any of the C0 and C1 control codes. Compared to ASCII, it adds 75 characters:
* 67 lowercase characters, allowing full lowercase support.
* 7 uppercase characters, allowing uppercase support for the 29 base letters without tone marks.
* The non-breaking space.

Tone marks on uppercase vowels are obtained in TCVN3 by switching to an all-capital font.

VSCII-2, also known as TCVN 5712-2 and VN2, is a superset of VSCII-3. It is an extended ASCII, because it keeps all 128 codes of ASCII unmodified. It does not reassign any of the C0 and C1 control codes, making it conformant with ISO 2022 as a 96-set. Compared to VSCII-3, it adds the following (for a total of 96 non-ASCII characters):
* 16 more uppercase characters with pre-composed tone marks (for a total of 23 non-ASCII uppercase characters).
* 5 combining diacritics for tone marks, allowing other combinations of uppercase letters and tone marks to be represented. Combining marks follow the base letter as in Unicode (rather than preceding it as in ANSEL).

VSCII-1, also known as TCVN 5712-1 and VN1, is an extension of VSCII-2, and is a modified ASCII, since it replaces 12 of the 33 control characters with precomposed characters. Compared to VSCII-2 (and for a total of 140 non-ASCII characters), it:
* Adds 44 more pre-composed uppercase letters, bringing them to the same count as the lowercase.
* Does this by replacing 12 ASCII control characters and allocating 32 graphical characters to the C1 control area, breaking ISO 2022 compatibility.

Conversion from VSCII-3 to VSCII-2 or VSCII-1 and conversion from VSCII-2 to VSCII-1 are not necessary, but can result in smaller files. Conversion from VSCII-1 to VSCII-2 or VSCII-3 and conversion from VSCII-2 to VSCII-3 require expansion of some pre-composed characters.
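The expansion of pre-composed characters mentioned above can be illustrated at the Unicode level. The following is only a sketch, not a VSCII converter: it works on Unicode strings rather than VSCII bytes, the names `TONE_MARKS`, `split_tone` and `expand_uppercase_tones` are hypothetical, and it expands every toned uppercase letter, whereas VSCII-2 keeps 16 of them pre-composed (which ones is not listed here). It shows an uppercase letter being split into a base letter followed by a combining tone mark, in the base-then-mark order that VSCII-2 shares with Unicode.

```python
import unicodedata

# The five Vietnamese tone marks as Unicode combining characters.
TONE_MARKS = {
    "\u0300",  # combining grave accent
    "\u0301",  # combining acute accent
    "\u0303",  # combining tilde
    "\u0309",  # combining hook above
    "\u0323",  # combining dot below
}


def split_tone(ch: str) -> str:
    """Return ch as a base letter followed by its tone mark, if it has one.

    Vowel-quality marks (breve, circumflex, horn) stay on the base letter;
    only the tone mark is moved out into a trailing combining character.
    """
    decomposed = unicodedata.normalize("NFD", ch)
    tones = "".join(c for c in decomposed if c in TONE_MARKS)
    base = "".join(c for c in decomposed if c not in TONE_MARKS)
    return unicodedata.normalize("NFC", base) + tones


def expand_uppercase_tones(text: str) -> str:
    """Expand toned uppercase letters; leave every other character as is."""
    return "".join(split_tone(ch) if ch.isupper() else ch for ch in text)


# U+1EC6 (E with circumflex and dot below) becomes U+00CA followed by U+0323.
expanded = expand_uppercase_tones("VI\u1EC6T")
```

Applying Unicode NFC normalization to the result recombines the pairs, which mirrors the reverse conversion toward a form with more pre-composed letters.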

Character set
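As a rough aid to reading the three code spaces, the character counts given in the Encodings section can be checked against the standard layout of an 8-bit code: 32 positions in the C1 area (0x80 to 0x9F) and 96 positions in the upper graphic area (0xA0 to 0xFF). The sketch below is illustrative only; the names `C1_AREA`, `GR_96_SET` and `uses_c1_area` are hypothetical, and the helper is a heuristic that assumes the input is already known to be some form of VSCII.

```python
# Byte ranges of an 8-bit code space.
C1_AREA = range(0x80, 0xA0)     # 32 positions, control codes in VSCII-2/-3
GR_96_SET = range(0xA0, 0x100)  # 96 positions, the ISO 2022 "96-set"

# Non-ASCII character counts as stated in the Encodings section.
vscii3_count = 67 + 7 + 1            # lowercase + uppercase + no-break space
vscii2_count = vscii3_count + 16 + 5  # + toned uppercase + combining marks
vscii1_count = vscii2_count + 44      # + remaining pre-composed uppercase

# VSCII-2 fills the 96-set exactly; the 44 extra VSCII-1 letters
# need the 32 C1 positions plus 12 replaced control characters.
extra_vscii1_positions = len(C1_AREA) + 12


def uses_c1_area(data: bytes) -> bool:
    """Return True if any byte falls in 0x80-0x9F.

    VSCII-2 and VSCII-3 leave that area to control codes, so in VSCII text
    such bytes suggest VSCII-1. This says nothing about non-VSCII input.
    """
    return any(b in C1_AREA for b in data)
```

A file for which `uses_c1_area` returns `False` may still be VSCII-1, since the 12 reassigned C0 positions are not examined here.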
