Windows-1254
   HOME

TheInfoList



OR:

Windows-1254 is a code page used under Microsoft Windows (and for the web), to write Turkish that it was designed for (which is its dominant user, even though it can be used for some other languages too). Characters with codepoints A0 through FF are compatible with ISO 8859-9, but the CR range, which is reserved for
C1 control codes The C0 and C1 control code or control character sets define control codes for use in text by computer systems that use ASCII and derivatives of ASCII. The codes represent additional information about the text, such as the position of a cursor, ...
in ISO 8859, is instead used for additional characters (analogous to the relationship between ISO-8859-1 and
Windows-1252 Windows-1252 or CP-1252 ( code page 1252) is a single-byte character encoding of the Latin alphabet, used by default in the legacy components of Microsoft Windows for English and many European languages including Spanish, French, and German. I ...
). The
WHATWG The Web Hypertext Application Technology Working Group (WHATWG) is a community of people interested in evolving HTML and related technologies. The WHATWG was founded by individuals from Apple Inc., the Mozilla Foundation and Opera Software, l ...
Encoding Standard, which specifies the character encodings which are permitted in
HTML5 HTML5 is a markup language used for structuring and presenting content on the World Wide Web. It is the fifth and final major HTML version that is a World Wide Web Consortium (W3C) recommendation. The current specification is known as the HTML ...
and which compliant browsers must support, includes Windows-1254, which is used for both the Windows-1254 and ISO-8859-9 labels.
Unicode Unicode, formally The Unicode Standard,The formal version reference is is an information technology standard for the consistent encoding, representation, and handling of text expressed in most of the world's writing systems. The standard, wh ...
is preferred for modern applications; authors of new pages and the designers of new protocols are instructed to use
UTF-8 UTF-8 is a variable-length character encoding used for electronic communication. Defined by the Unicode Standard, the name is derived from ''Unicode'' (or ''Universal Coded Character Set'') ''Transformation Format 8-bit''. UTF-8 is capable of ...
instead. , less than 0.05% of all web pages use Windows-1254, and less than 0.06% use ISO-8859-9, which the WHATWG also requires web browsers to handle as Windows-1254. Since 1.9% of all websites located in Turkey use ISO-8859-9, plus the 1.3% that actually declare Windows-1254 used, in effect, 3.2% of websites there use Windows-1254. IBM uses code page 1254 (
CCSID A CCSID (coded character set identifier) is a 16-bit number that represents a particular encoding of a specific code page. For example, Unicode is a code page that has several encoding (so called "transformation") forms, like UTF-8, UTF-16 and U ...
1254 and euro sign extended CCSID 5350) for Windows-1254.


Character set

The following table shows Windows-1254. Each character is shown with its
Unicode Unicode, formally The Unicode Standard,The formal version reference is is an information technology standard for the consistent encoding, representation, and handling of text expressed in most of the world's writing systems. The standard, wh ...
equivalent.


See also

*
Latin script in Unicode Over a thousand characters from the Latin script are encoded in the Unicode Standard, grouped in several basic and extended Latin blocks. The extended ranges contain mainly precomposed letters plus diacritics that are equivalently encoded with co ...
* LMBCS-8 *
Unicode Unicode, formally The Unicode Standard,The formal version reference is is an information technology standard for the consistent encoding, representation, and handling of text expressed in most of the world's writing systems. The standard, wh ...
*
Universal Character Set The Universal Coded Character Set (UCS, Unicode) is a standard set of characters defined by the international standard ISO/IEC 10646, ''Information technology — Universal Coded Character Set (UCS)'' (plus amendments to that standard), w ...
** European Unicode subset (DIN 91379) *
UTF-8 UTF-8 is a variable-length character encoding used for electronic communication. Defined by the Unicode Standard, the name is derived from ''Unicode'' (or ''Universal Coded Character Set'') ''Transformation Format 8-bit''. UTF-8 is capable of ...
*
Western Latin character sets (computing) Several binary representations of 8-bit character sets for common Western European languages are compared in this article. These encodings were designed for representation of Italian, Spanish, Portuguese, French, German, Dutch, English, Danish, ...
*
Windows-1250 Windows-1250 is a code page used under Microsoft Windows to represent texts in Central European and Eastern European languages that use Latin script, such as Czech (which is its main user with half its use, though Czech has 96.6% use of UTF-8, an ...
*
Windows code page Windows code pages are sets of characters or code pages (known as character encodings in other operating systems) used in Microsoft Windows from the 1980s and 1990s. Windows code pages were gradually superseded when Unicode was implemented in Wind ...
s * ISO/IEC JTC 1/SC 2


References


External links


Windows 1254 reference chart

IANA Charset Name Registration of windows-1254
{{character encoding Windows code pages