Code page 1256

Windows-1256 is a code page used under Microsoft Windows to write Arabic and other languages that use the Arabic script, such as Persian and Urdu. It is compatible with neither ISO-8859-6
nor the MacArabic encoding. Windows-1256 encodes every abstract letter of the basic Arabic alphabet rather than every concrete visual form (isolated, initial, medial, final, or ligatured shape variants); that is, it encodes characters, not glyphs. The Arabic letters in the C0-FF (hexadecimal) range are in Arabic alphabetical order, but some Latin characters are interspersed among them. These are Windows-1252 Latin characters used for French, a language of historical relevance in former French colonies in North Africa such as Morocco and Algeria. This allowed French and Arabic text to be intermixed in Windows-1256 without code-page switching, although upper-case letters with diacritics were not included. IBM uses code page 1256 (CCSID 1256, the euro-sign-extended CCSID 5352, and the further extended CCSID 9448) for Windows-1256.
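
The character-versus-glyph distinction and the mixing of French and Arabic can be checked against any Windows-1256 codec. The following is a minimal sketch that assumes Python's built-in `cp1256` codec follows the layout described here; the helper `can_encode` and the sample string are illustrative names, not part of any specification.

```python
def can_encode(ch: str) -> bool:
    """Return True if the single character ch has a byte in Windows-1256."""
    try:
        ch.encode("cp1256")
        return True
    except UnicodeEncodeError:
        return False


# Mixed French and Arabic text ("café" followed by an Arabic word).
# Every character maps to exactly one byte, so no code-page switching is needed.
sample = "caf\u00e9 \u0633\u0644\u0627\u0645"
encoded = sample.encode("cp1256")
print(len(sample), len(encoded))

# Abstract letters are encodable; positional glyph variants are not.
print(can_encode("\u0627"))  # ARABIC LETTER ALEF
print(can_encode("\ufe8e"))  # ARABIC LETTER ALEF FINAL FORM (presentation form)

# Lower-case accented Latin letters are present, upper-case ones are not.
print(can_encode("\u00e9"), can_encode("\u00c9"))  # é versus É
```

Python's charmap codecs raise `UnicodeEncodeError` instead of substituting a best-fit character, which is why a simple try/except is enough for this check.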
Unicode is preferred over Windows-1256 in modern applications, especially on the Internet, where UTF-8 is the dominant encoding for web pages, including those in Arabic. Unicode also offers complete coverage of the Arabic script (see Arabic script in Unicode), unlike Windows-1256 or ISO-8859-6, which do not cover the additional characters. As of October 2022, fewer than 0.03% of all web pages used Windows-1256; although it is mostly used for Arabic and is the second-most popular encoding for that language, it accounts for only 1.6% of the Arabic text on the web.
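
Migrating legacy Windows-1256 data to UTF-8 amounts to decoding with one codec and encoding with the other. A minimal sketch, again assuming Python's built-in `cp1256` codec (the function name `cp1256_to_utf8` is hypothetical):

```python
def cp1256_to_utf8(raw: bytes) -> bytes:
    """Re-encode Windows-1256 bytes as UTF-8.

    Decoding is strict, so a byte the codec cannot map raises UnicodeDecodeError
    rather than being silently replaced.
    """
    return raw.decode("cp1256").encode("utf-8")


# Four Arabic letters occupy four bytes in Windows-1256.
legacy = "\u0633\u0644\u0627\u0645".encode("cp1256")
modern = cp1256_to_utf8(legacy)

# In UTF-8 each of these Arabic letters takes two bytes.
print(len(legacy), len(modern))
```

The conversion only gives the intended text if the input really is Windows-1256; bytes in another single-byte encoding such as ISO-8859-6 would decode without error but yield the wrong characters.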


Character set

Because the original specification left 9 byte values marked as "NOT USED", these bytes were later assigned to additional characters needed for the Perso-Arabic script (for Persian and Urdu) and the euro sign. The following table shows the extended version of Windows-1256. Each character is shown with its Unicode equivalent and its decimal code. Every Arabic letter is shown in its isolated form; the actual forms of the letters inside Arabic words are rendered by a combination of software rules and appropriate font support.
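
A listing of decimal codes and Unicode equivalents can be regenerated from any codec that implements Windows-1256. The sketch below assumes Python's built-in `cp1256` codec, whose mapping may differ in detail from a given vendor's table; the generator name `cp1256_rows` is illustrative.

```python
import unicodedata


def cp1256_rows(start: int = 0x80, stop: int = 0x100):
    """Yield (decimal code, Unicode code point, character name) per byte value."""
    for byte in range(start, stop):
        # errors="replace" yields U+FFFD for any byte the codec leaves undefined.
        ch = bytes([byte]).decode("cp1256", errors="replace")
        # Control and format characters may lack a name, hence the default.
        name = unicodedata.name(ch, "<no name>")
        yield byte, f"U+{ord(ch):04X}", name


# Print the upper half of the code page, where it departs from ASCII.
for code, codepoint, name in cp1256_rows():
    print(f"{code:3d}  {codepoint}  {name}")
```

Only the upper half (128 to 255) is listed by default because the lower half coincides with ASCII.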


See also

* LMBCS-4


External links


* Windows 1256 reference chart
* IANA Charset Name Registration of windows-1256