
The Basic Latin or C0 Controls and Basic Latin Unicode block is the first block of the Unicode standard, and the only block whose code points are each encoded in one byte in UTF-8. The block contains all the characters and control codes of the ASCII encoding. It ranges from U+0000 to U+007F, contains 128 characters and includes the C0 controls, ASCII punctuation and symbols, ASCII digits, both the uppercase and lowercase letters of the English alphabet, and one further control character (Delete). The Basic Latin block was included in its present form from version 1.0.0 of the Unicode Standard, without addition or alteration of the character repertoire. Its block name in Unicode 1.0 was ASCII.
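
The one-byte property can be checked directly. The following is a minimal Python sketch, assuming only Python's built-in UTF-8 codec; the names `utf8_length` and `BASIC_LATIN` are illustrative and not part of any standard. It encodes every code point of the block and compares the result with the first code point beyond it.

```python
def utf8_length(code_point: int) -> int:
    """Return the number of bytes UTF-8 uses for one code point."""
    return len(chr(code_point).encode("utf-8"))


# Basic Latin spans U+0000..U+007F (128 code points).
BASIC_LATIN = range(0x0000, 0x0080)

# Every code point in the block encodes to exactly one byte,
# and that byte has the same numeric value as the code point.
all_single_byte = all(
    chr(cp).encode("utf-8") == bytes([cp]) for cp in BASIC_LATIN
)

print(all_single_byte)      # True
print(utf8_length(0x007F))  # 1 (last code point of the block)
print(utf8_length(0x0080))  # 2 (first code point after the block)
```

Because each byte equals its code point, a text made up only of Basic Latin characters has the same byte sequence in ASCII and in UTF-8.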


Table of characters

The character U+005C (\) may show up as a yen (¥) or won (₩) sign in Japanese or Korean fonts that treat Unicode text (especially UTF-8) as a legacy character set in which the backslash was replaced by these signs.


Subheadings

The C0 Controls and Basic Latin block contains six subheadings.


C0 controls

The C0 controls, referred to as C0 ASCII control codes in version 1.0, are inherited from ASCII and other 7-bit and 8-bit encoding schemes. The alias names for C0 controls are taken from the ISO/IEC 6429:1992 standard.
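
As an illustration of how alias names behave in practice, here is a small Python sketch. It assumes Python 3.3 or later, where `unicodedata.lookup` accepts name aliases; the helper `describe_control` and its fallback string are hypothetical. Control codes carry no formal character name, so `unicodedata.name` only returns the supplied default, while a lookup by alias still resolves.

```python
import unicodedata


def describe_control(ch: str) -> str:
    """Return the formal name of ch, or a placeholder if it has none."""
    # Control characters have no formal name, so the default is returned.
    return unicodedata.name(ch, "<no formal name>")


print(describe_control("\x00"))      # <no formal name>
print(unicodedata.category("\x00"))  # Cc (control)

# Aliases resolve to the corresponding control codes.
nul = unicodedata.lookup("NULL")
line_feed = unicodedata.lookup("LINE FEED")
escape = unicodedata.lookup("ESCAPE")
print([f"U+{ord(c):04X}" for c in (nul, line_feed, escape)])
```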


ASCII punctuation and symbols

This subheading covers standard punctuation characters, simple mathematical operators, and symbols such as the dollar sign, percent sign, ampersand, underscore, and pipe.


ASCII digits

The ASCII digits subheading contains the ten standard European digits, 0–9 (U+0030 to U+0039).


Uppercase Latin alphabet

The Uppercase Latin alphabet subheading contains the standard 26-letter unaccented Latin alphabet in majuscule (uppercase) form.


Lowercase Latin alphabet

The Lowercase Latin alphabet subheading contains the standard 26-letter unaccented Latin alphabet in minuscule (lowercase) form.


Control character

The Control character subheading contains a single character, "Delete" (U+007F).


Number of symbols, letters and control codes

The table below shows the number of letters, symbols and control codes in each of the subheadings in the C0 Controls and Basic Latin block.
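
Counts of this kind can be recomputed from the Unicode character database. The sketch below is one possible approach, not the method used to build the table: it groups the 128 code points by Unicode general category rather than by chart subheading, so the 32 C0 controls and Delete are counted together as control codes, and the space is counted with punctuation and symbols. The function name `kind` and the group labels are illustrative.

```python
import unicodedata
from collections import Counter


def kind(ch: str) -> str:
    """Map a Basic Latin character to a coarse group via its general category."""
    category = unicodedata.category(ch)
    if category == "Cc":
        return "control code"
    if category == "Lu":
        return "uppercase letter"
    if category == "Ll":
        return "lowercase letter"
    if category == "Nd":
        return "digit"
    # Everything else in this block is space, punctuation, or a symbol.
    return "punctuation or symbol"


counts = Counter(kind(chr(cp)) for cp in range(0x0000, 0x0080))

for group, n in sorted(counts.items()):
    print(f"{group}: {n}")
# control code: 33
# digit: 10
# lowercase letter: 26
# punctuation or symbol: 33
# uppercase letter: 26
```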


Chart


Variants

Several of the characters are defined to render as a standardized variant if followed by a variation selector. A variant is defined for a zero with a short diagonal stroke: U+0030 DIGIT ZERO, U+FE00 VS1 (0︀). Twelve characters (#, *, and the ten digits) can be followed by U+FE0E VS15 or U+FE0F VS16 to create emoji variants. They are keycap base characters, for example #️⃣ (U+0023 NUMBER SIGN, U+FE0F VS16, U+20E3 COMBINING ENCLOSING KEYCAP). The VS15 version is "text presentation" while the VS16 version is "emoji-style".
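
The keycap sequence described above can be assembled code point by code point. This is a minimal Python sketch; the constant and function names (`KEYCAP_BASES`, `keycap_sequence`, `text_presentation`) are illustrative assumptions, and whether the result is drawn as an emoji depends on the font and renderer.

```python
import unicodedata

VS15 = "\uFE0E"    # VARIATION SELECTOR-15: text presentation
VS16 = "\uFE0F"    # VARIATION SELECTOR-16: emoji-style presentation
KEYCAP = "\u20E3"  # COMBINING ENCLOSING KEYCAP

# The twelve keycap base characters: #, *, and the ten digits.
KEYCAP_BASES = "#*0123456789"


def keycap_sequence(base: str) -> str:
    """Build base + VS16 + COMBINING ENCLOSING KEYCAP."""
    if len(base) != 1 or base not in KEYCAP_BASES:
        raise ValueError(f"not a keycap base character: {base!r}")
    return base + VS16 + KEYCAP


def text_presentation(base: str) -> str:
    """Request the text-style variant of a keycap base character."""
    if len(base) != 1 or base not in KEYCAP_BASES:
        raise ValueError(f"not a keycap base character: {base!r}")
    return base + VS15


sequence = keycap_sequence("#")
print([f"U+{ord(c):04X}" for c in sequence])  # ['U+0023', 'U+FE0F', 'U+20E3']
print(unicodedata.name(KEYCAP))               # COMBINING ENCLOSING KEYCAP
```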


History

The following Unicode-related documents record the purpose and process of defining specific characters in the Basic Latin block:


See also

* Character set
* ISO 8859-1

