
Myanmar is a Unicode block containing characters for the Burmese, Mon, Shan, Palaung, and Karen languages of Myanmar, as well as the Aiton and Phake languages of Northeast India. It is also used to write Pali and Sanskrit in Myanmar.
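As a small illustration of what belonging to the block means in practice, the sketch below tests whether a character falls in the range U+1000–U+109F. The names `MYANMAR_BLOCK` and `in_myanmar_block` are hypothetical and chosen only for this example.

```python
# Code points of the Myanmar block, U+1000 through U+109F inclusive.
MYANMAR_BLOCK = range(0x1000, 0x10A0)


def in_myanmar_block(ch: str) -> bool:
    """Return True if the single character ch lies in the Myanmar block."""
    return ord(ch) in MYANMAR_BLOCK


def count_myanmar(text: str) -> int:
    """Count how many characters of text belong to the Myanmar block."""
    return sum(1 for ch in text if in_myanmar_block(ch))
```

Note that this only checks the main block; characters in the Myanmar Extended blocks lie outside this range and would not be counted.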
Block
The block has sixteen variation sequences defined for standardized variants. They use U+FE00 VARIATION SELECTOR-1 (VS01) to denote the dotted letters used for the Khamti, Aiton, and Phake languages. (Rendering is font-dependent; the Padauk font, for example, supports some of the dotted forms.)
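A variation sequence is simply a base character followed immediately by the variation selector. The sketch below builds one such sequence, assuming U+1000 MYANMAR LETTER KA is among the sixteen base characters with a defined dotted form; the helper name `with_vs01` is hypothetical.

```python
VS01 = "\uFE00"  # VARIATION SELECTOR-1


def with_vs01(base: str) -> str:
    """Append VS01 to a single base character to request its variant form."""
    if len(base) != 1:
        raise ValueError("expected exactly one base character")
    return base + VS01


# Assumption: U+1000 (KA) has a standardized dotted variant.
seq = with_vs01("\u1000")
print(" ".join(f"U+{ord(c):04X}" for c in seq))  # U+1000 U+FE00
```

Whether the dotted glyph actually appears depends on the font; a font without the variant will normally display the plain base letter.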
History
The following Unicode-related documents record the purpose and process of defining specific characters in the Myanmar block:
Historic and nonstandard uses of range
In Unicode 1.0.0, part of the current Myanmar block was used for Tibetan. In Microsoft Windows, collation data referring to the old Tibetan block was retained as late as Windows XP, and removed in Windows 2003.
In Myanmar, devices and software localisation often use Zawgyi fonts rather than Unicode-compliant fonts. These use the same range as the Unicode Myanmar block (U+1000–U+109F), and are even applied to text encoded as UTF-8 (although Zawgyi text does not officially constitute UTF-8), despite only a subset of the code points being interpreted the same way. Zawgyi lacks support for Myanmar-script languages other than Burmese, but heuristic methods exist for detecting the encoding of text which is assumed to be Burmese.
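To give a flavour of what such a heuristic can look like, the toy sketch below relies on one well-known difference: Unicode stores the vowel sign E (U+1031) after the consonant it belongs to, whereas Zawgyi stores it before the consonant, in visual order. It is not the method used by any particular detector, and the names `E_FIRST` and `looks_like_zawgyi` are hypothetical. Real detectors are considerably more thorough.

```python
import re

# U+1031 at the very start of the text, or directly after a character
# outside the Myanmar block, cannot follow a consonant as Unicode order
# requires, so it hints at Zawgyi's visual ordering.
E_FIRST = re.compile(r"(?:^|[^\u1000-\u109F])\u1031")


def looks_like_zawgyi(text: str) -> bool:
    """Rough guess: True if text, assumed to be Burmese, shows Zawgyi ordering."""
    return bool(E_FIRST.search(text))


unicode_order = "\u1014\u1031"  # consonant NA, then vowel sign E
zawgyi_order = "\u1031\u1014"   # vowel sign E stored first

print(looks_like_zawgyi(unicode_order), looks_like_zawgyi(zawgyi_order))
```

A single rule like this only catches a vowel sign E at the start of the text or after a non-Myanmar character, and it gives no answer for text that contains no such sign, so it should be read as an illustration rather than a usable classifier.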
See also
* Myanmar Extended-A (Unicode block)
* Myanmar Extended-B (Unicode block)
* Myanmar Extended-C (Unicode block)