Windows 1256

Windows-1256 is a code page used under Microsoft Windows to write Arabic and other languages that use the Arabic script, such as Persian and Urdu. It is compatible with neither ISO-8859-6 nor the MacArabic encoding.

Windows-1256 encodes every abstract single letter of the basic Arabic alphabet, not every concrete visual form of isolated, initial, medial, final or ligatured letter shape variants (i.e. it encodes characters, not glyphs). The Arabic letters in the C0-FF range are in Arabic alphabetic order, but some Latin characters are interspersed among them. These are some of the Windows-1252 Latin characters used for French, since this European language has some historic relevance in former French colonies in North Africa such as Morocco and Algeria. This allows French and Arabic text to be intermixed in Windows-1256 without any need for code-page switching (however, upper-case letters with diacritics are not included).

IBM uses code page 1256 (CCSID 1256, the euro-sign-extended CCSID 5352, and the further extended CCSID 9448) for Windows-1256.
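
The mixing of French and Arabic in a single byte stream, and the absence of upper-case accented letters and of positional glyph forms, can be checked with a short sketch. It assumes that Python's built-in `cp1256` codec is an acceptable stand-in for Windows-1256; the sample text and the helper name `can_encode` are illustrative and not part of any specification.

```python
# Sketch only: assumes Python's built-in "cp1256" codec models Windows-1256.

def can_encode(text: str, encoding: str = "cp1256") -> bool:
    """Return True if every character of `text` exists in `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

# Illustrative sample: French "cafe" with e-acute, a space, Arabic "salam".
mixed = "caf\u00e9 \u0633\u0644\u0627\u0645"
encoded = mixed.encode("cp1256")     # one byte per character, no switching

# Lower-case e-acute keeps its Windows-1252 position; alef is in C0-FF.
e_acute = "\u00e9".encode("cp1256")  # byte 0xE9
alef = "\u0627".encode("cp1256")     # byte 0xC7

# Upper-case E-acute (U+00C9) is not part of the code page.
has_upper_e_acute = can_encode("\u00c9")

# A positional glyph (U+FE8E, ARABIC LETTER ALEF FINAL FORM) is not encoded:
# the code page stores the abstract letter only.
has_final_alef_glyph = can_encode("\ufe8e")
```

Here `encoded` has the same length as `mixed`, since each character occupies exactly one byte, while both `has_upper_e_acute` and `has_final_alef_glyph` come out false.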
Unicode is preferred over Windows-1256 in modern applications, especially on the Internet, where UTF-8 is the dominant encoding for web pages, including Arabic ones. Unicode covers the Arabic script completely (see Arabic script in Unicode), whereas Windows-1256 and ISO-8859-6 do not cover the additional characters. As of October 2022, less than 0.03% of all web pages used Windows-1256; although the encoding is mostly used for Arabic and is the second-most popular encoding for that language, it accounts for only 1.6% of the Arabic text on the web.
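
Migrating legacy text to UTF-8 amounts to decoding the bytes as Windows-1256 and re-encoding the result. The following is a minimal sketch, again assuming Python's built-in `cp1256` codec; the function name `cp1256_to_utf8` and the sample bytes are hypothetical.

```python
# Sketch only: re-encode legacy Windows-1256 bytes as UTF-8.
# Assumes Python's built-in "cp1256" codec; the sample data is illustrative.

def cp1256_to_utf8(data: bytes) -> bytes:
    """Decode Windows-1256 bytes and re-encode the text as UTF-8."""
    return data.decode("cp1256").encode("utf-8")

legacy_bytes = b"\xd3\xe1\xc7\xe3"   # Arabic "salam" in Windows-1256
utf8_bytes = cp1256_to_utf8(legacy_bytes)

# Each of these Arabic letters takes one byte in Windows-1256
# and two bytes in UTF-8.
sizes = (len(legacy_bytes), len(utf8_bytes))
```

The size difference is the main cost of the conversion: Arabic letters in the U+0600 block need two bytes each in UTF-8, so `sizes` is `(4, 8)` for this sample.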


Character set

The original code page left 9 values (bytes) marked as "NOT USED" in its specification. These bytes were later assigned to additional characters needed for the Perso-Arabic script (for the Persian and Urdu languages) and to the euro sign. The following table shows the extended version of Windows-1256. Each character is shown with its Unicode equivalent and its decimal code. Every Arabic letter is shown in its isolated form; the actual forms of the letters inside Arabic words are rendered by a combination of software rules and appropriate font support.
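
A listing of byte values with their Unicode equivalents and decimal codes can also be produced programmatically. The sketch below assumes that Python's built-in `cp1256` codec reflects the extended version of the code page; the generator name `cp1256_rows` is hypothetical.

```python
# Sketch only: list the upper half of Windows-1256 with Unicode equivalents.
# Assumes Python's built-in "cp1256" codec reflects the extended code page.
import unicodedata

def cp1256_rows(start: int = 0x80, stop: int = 0x100):
    """Yield (decimal code, 'U+XXXX', character name) for each byte value."""
    for code in range(start, stop):
        # errors="replace" gives U+FFFD if the codec leaves a byte undefined.
        char = bytes([code]).decode("cp1256", errors="replace")
        name = unicodedata.name(char, "<unnamed>")
        yield code, f"U+{ord(char):04X}", name

rows = list(cp1256_rows())

# Print the first few rows: decimal code, code point, Unicode name.
for code, codepoint, name in rows[:4]:
    print(f"{code:3d}  {codepoint}  {name}")
```

The names come from the Unicode character database and describe abstract letters; no positional form is implied by the listing.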


See also

* LMBCS-4


External links


* Windows 1256 reference chart
* IANA Charset Name Registration of windows-1256