In the Unicode standard, a plane is a contiguous group of 65,536 (2¹⁶) code points. There are 17 planes, identified by the numbers 0 to 16, which correspond to the possible values 00–10₁₆ of the first two positions in the six-position hexadecimal format (U+''hhhhhh''). Plane 0 is the Basic Multilingual Plane (BMP), which contains most commonly used characters. The higher planes 1 through 16 are called "supplementary planes". The last code point in Unicode is the last code point in plane 16, U+10FFFF. As of Unicode version 16.0, five of the planes have assigned code points (characters), and seven are named.
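Because every plane holds exactly 2¹⁶ code points, the plane number of a code point can be read off by discarding its low 16 bits. The following Python sketch illustrates this; the names `plane_of` and `six_digit` are illustrative only and are not part of any standard library.

```python
MAX_CODE_POINT = 0x10FFFF  # last code point, at the end of plane 16


def plane_of(code_point: int) -> int:
    """Return the plane number (0-16) that contains the code point."""
    if not 0 <= code_point <= MAX_CODE_POINT:
        raise ValueError(f"not a Unicode code point: {code_point:#x}")
    return code_point >> 16  # each plane spans 2**16 code points


def six_digit(code_point: int) -> str:
    """Format a code point in the six-position form U+hhhhhh."""
    return f"U+{code_point:06X}"


for cp in (0x0041, 0x1F600, 0x10FFFF):
    text = six_digit(cp)
    # The first two hex digits of the six-position form give the plane.
    print(text, "is in plane", plane_of(cp), "(hex prefix", text[2:4] + ")")
```

In everyday notation BMP code points are usually written with four hexadecimal digits (U+0041); the six-digit form is used here only to make the plane prefix visible.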
The limit of 17 planes is due to UTF-16, which can encode 2²⁰ code points (16 planes) as pairs of words, plus the BMP as a single word. UTF-8 was designed with a much larger limit of 2³¹ (2,147,483,648) code points (32,768 planes), and would still be able to encode 2²¹ (2,097,152) code points (32 planes) even under the current limit of 4 bytes.
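The plane counts quoted above follow from simple arithmetic, which the short Python sketch below reproduces. The variable names are illustrative; the surrogate ranges are the ones given for the BMP later in this article.

```python
PLANE_SIZE = 2 ** 16  # code points per plane

# UTF-16: every (high surrogate, low surrogate) pair addresses one
# supplementary code point; the BMP itself needs only a single 16-bit word.
HIGH_SURROGATES = 0xDBFF - 0xD800 + 1  # 1,024 values
LOW_SURROGATES = 0xDFFF - 0xDC00 + 1   # 1,024 values
utf16_supplementary = HIGH_SURROGATES * LOW_SURROGATES
utf16_planes = (utf16_supplementary + PLANE_SIZE) // PLANE_SIZE

# UTF-8: original design limit versus the current 4-byte limit.
utf8_original_planes = 2 ** 31 // PLANE_SIZE
utf8_four_byte_planes = 2 ** 21 // PLANE_SIZE

print("UTF-16 supplementary planes:", utf16_supplementary // PLANE_SIZE)
print("UTF-16 planes in total:", utf16_planes)
print("UTF-8 planes (original design):", utf8_original_planes)
print("UTF-8 planes (4-byte limit):", utf8_four_byte_planes)
```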
The 17 planes can accommodate 1,114,112 code points. Of these, 2,048 are surrogates (used to make the pairs in UTF-16), 66 are non-characters, and 137,468 are reserved for private use, leaving 974,530 for public assignment.
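These figures can be checked by counting. The Python sketch below classifies every code point with three small predicates. The predicate names are illustrative. The exact non-character and private-use ranges are not spelled out in this article; they are taken here as assumptions from the Unicode Standard: non-characters are U+FDD0–U+FDEF plus the last two code points of every plane, and private use covers U+E000–U+F8FF plus planes 15 and 16 without their final two code points.

```python
PLANE_SIZE = 1 << 16
TOTAL = 17 * PLANE_SIZE  # all code points, U+0000 through U+10FFFF


def is_surrogate(cp: int) -> bool:
    return 0xD800 <= cp <= 0xDFFF


def is_noncharacter(cp: int) -> bool:
    # Assumed ranges: U+FDD0-U+FDEF, plus U+nFFFE and U+nFFFF in each plane.
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def is_private_use(cp: int) -> bool:
    # Assumed ranges: the BMP Private Use Area and planes 15 and 16,
    # excluding the two non-characters that end each of those planes.
    return (0xE000 <= cp <= 0xF8FF
            or 0xF0000 <= cp <= 0xFFFFD
            or 0x100000 <= cp <= 0x10FFFD)


surrogates = sum(1 for cp in range(TOTAL) if is_surrogate(cp))
noncharacters = sum(1 for cp in range(TOTAL) if is_noncharacter(cp))
private_use = sum(1 for cp in range(TOTAL) if is_private_use(cp))
public = TOTAL - surrogates - noncharacters - private_use

print(f"{TOTAL:,} total, {surrogates:,} surrogates, "
      f"{noncharacters:,} non-characters, {private_use:,} private use, "
      f"{public:,} left for public assignment")
```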
Planes are further subdivided into Unicode blocks, which, unlike planes, do not have a fixed size. The 338 blocks defined in Unicode cover 27% of the possible code point space, and range in size from a minimum of 16 code points (sixteen blocks) to a maximum of 65,536 code points (Supplementary Private Use Area-A and -B, which constitute the entirety of planes 15 and 16). For future usage, ranges of characters have been tentatively mapped out for most known current and ancient writing systems.
Overview
Assigned characters
Basic Multilingual Plane
The first plane, plane 0, the Basic Multilingual Plane (BMP), contains characters for almost all modern languages, and a large number of symbols. A primary objective for the BMP is to support the unification of prior character sets as well as characters for writing. Most of the assigned code points in the BMP are used to encode Chinese, Japanese, and Korean (CJK) characters.
The High Surrogate (U+D800–U+DBFF) and Low Surrogate (U+DC00–U+DFFF) codes are reserved for encoding non-BMP characters in UTF-16 by using a ''pair'' of 16-bit codes: one High Surrogate and one Low Surrogate. A single surrogate code point will never be assigned a character.
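The pairing can be sketched in a few lines of Python. The function names `to_surrogate_pair` and `from_surrogate_pair` are illustrative; the arithmetic is the standard UTF-16 mapping, which subtracts 0x10000 and splits the remaining 20 bits into two 10-bit halves.

```python
def to_surrogate_pair(code_point: int) -> tuple[int, int]:
    """Split a supplementary code point into (high, low) surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points outside the BMP need a pair")
    offset = code_point - 0x10000        # 20-bit value
    high = 0xD800 + (offset >> 10)       # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits
    return high, low


def from_surrogate_pair(high: int, low: int) -> int:
    """Recombine a (high, low) surrogate pair into one code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("expected a high surrogate followed by a low one")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


high, low = to_surrogate_pair(0x1F600)
print(f"U+1F600 -> {high:04X} {low:04X}")
print(f"back again -> U+{from_surrogate_pair(high, low):X}")
```

The result can be compared with Python's own encoder: `"\U0001F600".encode("utf-16-be")` yields the same two 16-bit units.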
65,520 of the 65,536 code points in this plane have been allocated to a Unicode block, leaving just 16 code points in a single unallocated range (2FE0..2FEF).
As of Unicode 16.0, the BMP comprises the following 164 blocks:
* Alphabetic left-to-right scripts:
** Basic Latin (Lower half of ISO/IEC 8859-1: ISO/IEC 646:1991-IRV aka ASCII) (0000–007F)
** Latin-1 Supplement (Upper half of ISO/IEC 8859-1) (0080–00FF)
** Latin Extended-A (0100–017F)
** Latin Extended-B (0180–024F)
** IPA Extensions (0250–02AF)
** Spacing Modifier Letters (02B0–02FF)
** Combining Diacritical Marks (0300–036F)
** Greek and Coptic (0370–03FF)
** Cyrillic (0400–04FF)
** Cyrillic Supplement (0500–052F)
** Armenian (0530–058F)
* Semitic abjads and other right-to-left scripts:
** Hebrew (0590–05FF)
** Arabic (0600–06FF)
** Syriac (0700–074F)
** Arabic Supplement (0750–077F)
** Thaana (0780–07BF)
** N'Ko (07C0–07FF)
** Samaritan (0800–083F)
** Mandaic (0840–085F)
** Syriac Supplement (0860–086F)
** Arabic Extended-B (0870–089F)
** Arabic Extended-A (08A0–08FF)
* Brahmic scripts:
** Devanagari (0900–097F)
** Bengali (0980–09FF)
** Gurmukhi (0A00–0A7F)
** Gujarati (0A80–0AFF)
** Oriya (0B00–0B7F)
** Tamil (0B80–0BFF)
** Telugu (0C00–0C7F)
** Kannada (0C80–0CFF)
** Malayalam (0D00–0D7F)
** Sinhala (0D80–0DFF)
** Thai (0E00–0E7F)
** Lao (0E80–0EFF)
** Tibetan (0F00–0FFF)
** Myanmar (1000–109F)
* Other alphabetic or syllabic left-to-right scripts:
** Georgian (10A0–10FF)
** Hangul Jamo (1100–11FF)
** Ethiopic (1200–137F)
** Ethiopic Supplement (1380–139F)
** Cherokee (13A0–13FF)
** Unified Canadian Aboriginal Syllabics (1400–167F)
** Ogham (1680–169F)
** Runic (16A0–16FF)
* Philippine scripts:
** Tagalog (1700–171F)
** Hanunoo (1720–173F)
** Buhid (1740–175F)
** Tagbanwa (1760–177F)
* Khmer (1780–17FF)
* Mongolian (1800–18AF)
* Unified Canadian Aboriginal Syllabics Extended (18B0–18FF)
* Brahmic scripts:
** Limbu (1900–194F)
* Tai scripts:
** Tai Le (1950–197F)
** New Tai Lue (1980–19DF)
** Khmer Symbols (19E0–19FF)
** Buginese (1A00–1A1F)
** Tai Tham (1A20–1AAF)
* Combining Diacritical Marks Extended (1AB0–1AFF)
* Indonesian scripts:
** Balinese (1B00–1B7F)
** Sundanese (1B80–1BBF)
** Batak (1BC0–1BFF)
* Lepcha (1C00–1C4F)
* Ol Chiki (1C50–1C7F)
* Other left-to-right alphabetic or syllabic supplements:
** Cyrillic Extended-C (1C80–1C8F)
** Georgian Extended (1C90–1CBF)
* Sundanese Supplement (1CC0–1CCF)
* Vedic Extensions (1CD0–1CFF)
* Other left-to-right alphabetic supplements:
** Phonetic Extensions (1D00–1D7F)
** Phonetic Extensions Supplement (1D80–1DBF)
** Combining Diacritical Marks Supplement (1DC0–1DFF)
** Latin Extended Additional (1E00–1EFF)
** Greek Extended (1F00–1FFF)
* Symbols:
** General Punctuation (2000–206F)
** Superscripts and Subscripts (2070–209F)
** Currency Symbols (20A0–20CF)
** Combining Diacritical Marks for Symbols (20D0–20FF)
** Letterlike Symbols (2100–214F)
** Number Forms (2150–218F)
** Arrows (2190–21FF)
** Mathematical Operators (2200–22FF)
** Miscellaneous Technical (2300–23FF)
** Control Pictures (2400–243F)
** Optical Character Recognition (2440–245F)
** Enclosed Alphanumerics (2460–24FF)
** Box Drawing (2500–257F)
** Block Elements (2580–259F)
** Geometric Shapes (25A0–25FF)
** Miscellaneous Symbols (2600–26FF)
** Dingbats (2700–27BF)
** Miscellaneous Mathematical Symbols-A (27C0–27EF)
** Supplemental Arrows-A (27F0–27FF)
** Braille Patterns (2800–28FF)
** Supplemental Arrows-B (2900–297F)
** Miscellaneous Mathematical Symbols-B (2980–29FF)
** Supplemental Mathematical Operators (2A00–2AFF)
** Miscellaneous Symbols and Arrows (2B00–2BFF)
* Other left-to-right alphabetic scripts or supplements:
** Glagolitic (2C00–2C5F)
** Latin Extended-C (2C60–2C7F)
** Coptic (2C80–2CFF)
** Georgian Supplement (2D00–2D2F)
* African scripts:
** Tifinagh (2D30–2D7F)
** Ethiopic Extended (2D80–2DDF)
* Other left-to-right alphabetic supplements:
** Cyrillic Extended-A (2DE0–2DFF)
** Supplemental Punctuation (2E00–2E7F)
* CJK scripts and symbols:
** CJK Radicals Supplement (2E80–2EFF)
** Kangxi Radicals (2F00–2FDF)
** Ideographic Description Characters (2FF0–2FFF)
** CJK Symbols and Punctuation (3000–303F)
** Hiragana (3040–309F)
** Katakana (30A0–30FF)
** Bopomofo (3100–312F)
** Hangul Compatibility Jamo (3130–318F)
** Kanbun (3190–319F)
** Bopomofo Extended (31A0–31BF)
** CJK Strokes (31C0–31EF)
** Katakana Phonetic Extensions (31F0–31FF)
** Enclosed CJK Letters and Months (3200–32FF)
** CJK Compatibility (3300–33FF)
** CJK Unified Ideographs Extension A (3400–4DBF)
** Yijing Hexagram Symbols (4DC0–4DFF)
** CJK Unified Ideographs (4E00–9FFF)
* Yi Syllables (A000–A48F)
* Yi Radicals (A490–A4CF)
* Lisu (A4D0–A4FF)
* African scripts:
** Vai (A500–A63F)
* Other left-to-right alphabetic supplements:
** Cyrillic Extended-B (A640–A69F)
* African scripts:
** Bamum (A6A0–A6FF)
* Other left-to-right alphabetic supplements:
** Modifier Tone Letters (A700–A71F)
** Latin Extended-D (A720–A7FF)
* Brahmic scripts:
** Syloti Nagri (A800–A82F)
** Common Indic Number Forms (A830–A83F)
** Phags-pa (A840–A87F)
** Saurashtra (A880–A8DF)
** Devanagari Extended (A8E0–A8FF)
** Kayah Li (A900–A92F)
** Rejang (A930–A95F)
* Hangul Jamo Extended-A (A960–A97F)
* Brahmic scripts:
** Javanese (A980–A9DF)
** Myanmar Extended-B (A9E0–A9FF)
** Cham (AA00–AA5F)
** Myanmar Extended-A (AA60–AA7F)
** Tai Viet (AA80–AADF)
** Meetei Mayek Extensions (AAE0–AAFF)
* Ethiopic Extended-A (AB00–AB2F)
* Latin Extended-E (AB30–AB6F)
* Cherokee Supplement (AB70–ABBF)
* Meetei Mayek (ABC0–ABFF)
* Hangul Syllables (AC00–D7AF)
* Hangul Jamo Extended-B (D7B0–D7FF)
* Surrogates:
** High Surrogates (D800–DB7F)
** High Private Use Surrogates (DB80–DBFF)
** Low Surrogates (DC00–DFFF)
* Private Use Area (E000–F8FF)
* CJK Compatibility Ideographs (F900–FAFF)
* Alphabetic Presentation Forms (FB00–FB4F)
* Arabic Presentation Forms-A (FB50–FDFF)
* Variation Selectors (FE00–FE0F)
* Vertical Forms (FE10–FE1F)
* Combining Half Marks (FE20–FE2F)
* CJK Compatibility Forms (FE30–FE4F)
* Small Form Variants (FE50–FE6F)
* Arabic Presentation Forms-B (FE70–FEFF)
* Halfwidth and Fullwidth Forms (FF00–FFEF)
* Specials (FFF0–FFFF)
Supplementary Multilingual Plane

Plane 1, the Supplementary Multilingual Plane (SMP), contains historic scripts (except CJK ideographic), and symbols and notation used within certain fields. Scripts include Linear B, Egyptian hieroglyphs, and cuneiform scripts. It also includes English reform orthographies like Shavian and Deseret, and some modern scripts like Osage, Warang Citi, Adlam, Wancho and Toto. Symbols and notations include historic and modern musical notation; mathematical alphanumerics; shorthands; Emoji and other pictographic sets; and game symbols for playing cards, mahjong, and dominoes.
As of Unicode 16.0, the SMP comprises the following 161 blocks:
* Archaic Greek and other left-to-right scripts:
** Linear B Syllabary (10000–1007F)
** Linear B Ideograms (10080–100FF)
** Aegean Numbers (10100–1013F)
** Ancient Greek Numbers (10140–1018F)
** Ancient Symbols (10190–101CF)
** Phaistos Disc (101D0–101FF)
** Lycian (10280–1029F)
** Carian (102A0–102DF)
** Coptic Epact Numbers (102E0–102FF)
** Old Italic (10300–1032F)
** Gothic (10330–1034F)
** Old Permic (10350–1037F)
** Ugaritic (10380–1039F)
** Old Persian (103A0–103DF)
** Deseret (10400–1044F)
** Shavian (10450–1047F)
** Osmanya (10480–104AF)
** Osage (104B0–104FF)
** Elbasan (10500–1052F)
** Caucasian Albanian (10530–1056F)
** Vithkuqi (10570–105BF)
** Todhri (105C0–105FF)
** Linear A (10600–1077F)
** Latin Extended-F (10780–107BF)
* Right-to-left scripts:
** Cypriot Syllabary (10800–1083F)
** Imperial Aramaic (10840–1085F)
** Palmyrene (10860–1087F)
** Nabataean (10880–108AF)
** Hatran (108E0–108FF)
** Phoenician (10900–1091F)
** Lydian (10920–1093F)
** Meroitic Hieroglyphs (10980–1099F)
** Meroitic Cursive (109A0–109FF)
** Kharoshthi (10A00–10A5F)
** Old South Arabian (10A60–10A7F)
** Old North Arabian (10A80–10A9F)
** Manichaean (10AC0–10AFF)
** Avestan (10B00–10B3F)
** Inscriptional Parthian (10B40–10B5F)
** Inscriptional Pahlavi (10B60–10B7F)
** Psalter Pahlavi (10B80–10BAF)
** Old Turkic (10C00–10C4F)
** Old Hungarian (10C80–10CFF)
** Hanifi Rohingya (10D00–10D3F)
** Garay (10D40–10D8F)
** Rumi Numeral Symbols (10E60–10E7F)
** Yezidi (10E80–10EBF)
** Arabic Extended-C (10EC0–10EFF)
** Old Sogdian (10F00–10F2F)
** Sogdian (10F30–10F6F)
** Old Uyghur (10F70–10FAF)
** Chorasmian (10FB0–10FDF)
** Elymaic (10FE0–10FFF)
* Brahmic scripts:
** Brahmi (11000–1107F)
** Kaithi (11080–110CF)
** Sora Sompeng (110D0–110FF)
** Chakma (11100–1114F)
** Mahajani (11150–1117F)
** Sharada (11180–111DF)
** Sinhala Archaic Numbers (111E0–111FF)
** Khojki (11200–1124F)
** Multani (11280–112AF)
** Khudawadi (112B0–112FF)
** Grantha (11300–1137F)
** Tulu-Tigalari (11380–113FF)
** Newa (11400–1147F)
** Tirhuta (11480–114DF)
** Siddham (11580–115FF)
** Modi (11600–1165F)
** Mongolian Supplement (11660–1167F)
** Takri (11680–116CF)
** Myanmar Extended-C (116D0–116FF)
** Ahom (11700–1174F)
** Dogra (11800–1184F)
** Warang Citi (118A0–118FF)
** Dives Akuru (11900–1195F)
** Nandinagari (119A0–119FF)
** Zanabazar Square (11A00–11A4F)
** Soyombo (11A50–11AAF)
* Unified Canadian Aboriginal Syllabics Extended-A (11AB0–11ABF)
* Brahmic scripts:
** Pau Cin Hau (11AC0–11AFF)
** Devanagari Extended-A (11B00–11B5F)
** Sunuwar (11BC0–11BFF)
** Bhaiksuki (11C00–11C6F)
** Marchen (11C70–11CBF)
** Masaram Gondi (11D00–11D5F)
** Gunjala Gondi (11D60–11DAF)
** Makasar (11EE0–11EFF)
** Kawi (11F00–11F5F)
* Lisu Supplement (11FB0–11FBF)
* Tamil Supplement (11FC0–11FFF)
* Cuneiform scripts:
** Cuneiform (12000–123FF)
** Cuneiform Numbers and Punctuation (12400–1247F)
** Early Dynastic Cuneiform (12480–1254F)
* Cypro-Minoan (12F90–12FFF)
* Hieroglyphic scripts:
** Egyptian Hieroglyphs (13000–1342F)
** Egyptian Hieroglyph Format Controls (13430–1345F)
** Egyptian Hieroglyphs Extended-A (13460–143FF)
** Anatolian Hieroglyphs (14400–1467F)
* Gurung Khema (16100–1613F)
* Bamum Supplement (16800–16A3F)
* Mro (16A40–16A6F)
* Tangsa (16A70–16ACF)
* Bassa Vah (16AD0–16AFF)
* Pahawh Hmong (16B00–16B8F)
* Kirat Rai (16D40–16D7F)
* Medefaidrin (16E40–16E9F)
* Miao (16F00–16F9F)
* East Asian scripts:
** Ideographic Symbols and Punctuation (16FE0–16FFF)
** Tangut (17000–187FF)
** Tangut Components (18800–18AFF)
** Khitan Small Script (18B00–18CFF)
** Tangut Supplement (18D00–18D7F)
** Kana Extended-B (1AFF0–1AFFF)
** Kana Supplement (1B000–1B0FF)
** Kana Extended-A (1B100–1B12F)
** Small Kana Extension (1B130–1B16F)
** Nushu (1B170–1B2FF)
* Notational writing systems:
** Duployan (1BC00–1BC9F)
** Shorthand Format Controls (1BCA0–1BCAF)
* Symbols for Legacy Computing Supplement (1CC00–1CEBF)
* Symbols and numerals:
** Musical notation:
*** Znamenny Musical Notation (1CF00–1CFCF)
*** Byzantine Musical Symbols (1D000–1D0FF)
*** Musical Symbols (1D100–1D1FF)
*** Ancient Greek Musical Notation (1D200–1D24F)
** Kaktovik Numerals (1D2C0–1D2DF)
** Mayan Numerals (1D2E0–1D2FF)
** Mathematical symbols:
*** Tai Xuan Jing Symbols (1D300–1D35F)
*** Counting Rod Numerals (1D360–1D37F)
*** Mathematical Alphanumeric Symbols (1D400–1D7FF)
* Notational writing systems:
** Sutton SignWriting (1D800–1DAAF)
* Other left-to-right scripts:
** Latin Extended-G (1DF00–1DFFF)
** Glagolitic Supplement (1E000–1E02F)
** Cyrillic Extended-D (1E030–1E08F)
* Nyiakeng Puachue Hmong (1E100–1E14F)
* Toto (1E290–1E2BF)
* Wancho (1E2C0–1E2FF)
* Nag Mundari (1E4D0–1E4FF)
* Ol Onal (1E5D0–1E5FF)
* African scripts:
** Ethiopic Extended-B (1E7E0–1E7FF)
** Mende Kikakui (1E800–1E8DF)
** Adlam (1E900–1E95F)
* Symbols and numerals:
** Indic Siyaq Numbers (1EC70–1ECBF)
** Ottoman Siyaq Numbers (1ED00–1ED4F)
** Arabic Mathematical Alphabetic Symbols (1EE00–1EEFF)
** Game tiles and cards:
*** Mahjong Tiles (1F000–1F02F)
*** Domino Tiles (1F030–1F09F)
*** Playing Cards (1F0A0–1F0FF)
** Enclosed Alphanumeric Supplement (1F100–1F1FF)
** Enclosed Ideographic Supplement (1F200–1F2FF)
** Miscellaneous Symbols and Pictographs (1F300–1F5FF)
** Emoticons (1F600–1F64F)
** Ornamental Dingbats (1F650–1F67F)
** Transport and Map Symbols (1F680–1F6FF)
** Alchemical Symbols (1F700–1F77F)
** Geometric Shapes Extended (1F780–1F7FF)
** Supplemental Arrows-C (1F800–1F8FF)
** Supplemental Symbols and Pictographs (1F900–1F9FF)
** Chess Symbols (1FA00–1FA6F)
** Symbols and Pictographs Extended-A (1FA70–1FAFF)
** Symbols for Legacy Computing (1FB00–1FBFF)
Supplementary Ideographic Plane

Plane 2, the Supplementary Ideographic Plane (SIP), is used for CJK Ideographs, mostly CJK Unified Ideographs, that were not included in earlier character encoding standards.
As of Unicode 16.0, the SIP comprises the following seven blocks:
* CJK Unified Ideographs Extension B (20000–2A6DF)
* CJK Unified Ideographs Extension C (2A700–2B73F)
* CJK Unified Ideographs Extension D (2B740–2B81F)
* CJK Unified Ideographs Extension E (2B820–2CEAF)
* CJK Unified Ideographs Extension F (2CEB0–2EBEF)
* CJK Unified Ideographs Extension I (2EBF0–2EE5F)
* CJK Compatibility Ideographs Supplement (2F800–2FA1F)
Tertiary Ideographic Plane

Plane 3 is the Tertiary Ideographic Plane (TIP). CJK Unified Ideographs Extension G was added to the TIP in Unicode 13.0, released in March 2020. The plane is also tentatively allocated for Oracle Bone script and Small Seal Script.
As of Unicode 16.0, the TIP comprises the following two blocks:
* CJK Unified Ideographs Extension G (30000–3134F)
* CJK Unified Ideographs Extension H (31350–323AF)
Unassigned planes
No characters have yet been assigned, or proposed for assignment, to planes 4 through 13 (planes 4 to D in hexadecimal).
Supplementary Special-purpose Plane

Plane 14 (E in hexadecimal) is designated as the Supplementary Special-purpose Plane (SSP). It comprises the following two blocks:
* Tags (E0000–E007F)
* Variation Selectors Supplement (E0100–E01EF) – used to indicate alternate glyphs for characters.
Private Use Area Planes
The two planes 15 and 16 (planes F and 10 in hexadecimal) each contain a "Private Use Area". They contain blocks named Supplementary Private Use Area-A (PUA-A) and -B (PUA-B). The Private Use Areas are available for use by parties outside ISO and Unicode (private use character encoding).
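The seven named planes described in this article can be gathered into a small lookup table. The Python sketch below is illustrative only: `PLANE_NAMES` and `describe_plane` are hypothetical names, and planes 15 and 16 are labelled with the names of the blocks that fill them.

```python
PLANE_NAMES = {
    0: "Basic Multilingual Plane (BMP)",
    1: "Supplementary Multilingual Plane (SMP)",
    2: "Supplementary Ideographic Plane (SIP)",
    3: "Tertiary Ideographic Plane (TIP)",
    14: "Supplementary Special-purpose Plane (SSP)",
    15: "Supplementary Private Use Area-A (PUA-A)",
    16: "Supplementary Private Use Area-B (PUA-B)",
}


def describe_plane(code_point: int) -> str:
    """Name the plane that holds a code point; planes 4-13 have no name."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError(f"not a Unicode code point: {code_point:#x}")
    plane = code_point >> 16
    return PLANE_NAMES.get(plane, f"Plane {plane} (unassigned)")


for cp in (0x4E00, 0x1F600, 0x20000, 0x50000, 0xE0100, 0x100000):
    print(f"U+{cp:04X}: {describe_plane(cp)}")
```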