Complex text layout
   HOME

TheInfoList



OR:

Complex text layout (CTL) or complex text rendering is the
typesetting Typesetting is the composition of text by means of arranging physical ''type'' (or ''sort'') in mechanical systems or '' glyphs'' in digital systems representing '' characters'' (letters and other symbols).Dictionary.com Unabridged. Random ...
of
writing system A writing system is a method of visually representing verbal communication, based on a script and a set of rules regulating its use. While both writing and speech are useful in conveying messages, writing differs in also being a reliable fo ...
s in which the shape or positioning of a
grapheme In linguistics, a grapheme is the smallest functional unit of a writing system. The word ''grapheme'' is derived and the suffix ''-eme'' by analogy with ''phoneme'' and other names of emic units. The study of graphemes is called '' graphemi ...
depends on its relation to other graphemes. The term is used in the field of software
internationalization In economics, internationalization or internationalisation is the process of increasing involvement of enterprises in international markets, although there is no agreed definition of internationalization. Internationalization is a crucial strateg ...
, where each grapheme is a
character Character or Characters may refer to: Arts, entertainment, and media Literature * ''Character'' (novel), a 1936 Dutch novel by Ferdinand Bordewijk * ''Characters'' (Theophrastus), a classical Greek set of character sketches attributed to The ...
. Scripts which require CTL for proper display may be known as complex scripts. Examples include the Arabic alphabet and scripts of the
Brahmic family The Brahmic scripts, also known as Indic scripts, are a family of abugida writing systems. They are used throughout the Indian subcontinent, Southeast Asia and parts of East Asia. They are descended from the Brahmi script of ancient India ...
, such as
Devanagari Devanagari ( ; , , Sanskrit pronunciation: ), also called Nagari (),Kathleen Kuiper (2010), The Culture of India, New York: The Rosen Publishing Group, , page 83 is a left-to-right abugida (a type of segmental writing system), based on the ...
,
Khmer script Khmer script ( km, អក្សរខ្មែរ, )Huffman, Franklin. 1970. ''Cambodian System of Writing and Beginning Reader''. Yale University Press. . is an abugida (alphasyllabary) script used to write the Khmer language, the official la ...
or the
Thai alphabet The Thai script ( th, อักษรไทย, ) is the abugida used to write Thai, Southern Thai and many other languages spoken in Thailand. The Thai alphabet itself (as used to write Thai) has 44 consonant symbols ( th, พยัญชน ...
. Many scripts do not require CTL. For instance, the
Latin alphabet The Latin alphabet or Roman alphabet is the collection of letters originally used by the ancient Romans to write the Latin language. Largely unaltered with the exception of extensions (such as diacritics), it used to write English and the ...
or
Chinese character Chinese characters () are logograms developed for the writing of Chinese. In addition, they have been adapted to write other East Asian languages, and remain a key component of the Japanese writing system where they are known as ''kanji' ...
s can be typeset by simply displaying each character one after another in straight rows or columns. However, even these scripts have alternate forms or optional features (such as
cursive Cursive (also known as script, among other names) is any style of penmanship in which characters are written joined in a flowing manner, generally for the purpose of making writing faster, in contrast to block letters. It varies in functionali ...
writing) which require CTL to produce on computers.


Characteristics requiring CTL

The main characteristics of CTL complexity are: *
Bi-directional text A bidirectional text contains two text directionalities, right-to-left (RTL) and left-to-right (LTR). It generally involves text containing different types of alphabets, but may also refer to boustrophedon, which is changing text direction in ea ...
, where characters may be written from either right-to-left or left-to-right direction. * Context-sensitive shaping and ligatures, where a character may change its shape, dependent on its location and/or the surrounding characters. For example, a character in
Arabic script The Arabic script is the writing system used for Arabic and several other languages of Asia and Africa. It is the second-most widely used writing system in the world by number of countries using it or a script directly derived from it, and th ...
can have as many as four different shape-forms, depending on context. * Ordering, where the displayed order of the characters is not the same as the logical order. For example, in Devanagari, which is written from left to right, the grapheme for "short i" appears to the left of ("before") the consonant that it follows: in ''ki'', the ''-i'' should render on the left, its bow reaching until above the ''k-'' to the right. Not all occurrences of these characteristics require CTL. For example, the
Greek alphabet The Greek alphabet has been used to write the Greek language since the late 9th or early 8th century BCE. It is derived from the earlier Phoenician alphabet, and was the earliest known alphabetic script to have distinct letters for vowels as ...
has context-sensitive shaping of the letter
sigma Sigma (; uppercase Σ, lowercase σ, lowercase in word-final position ς; grc-gre, σίγμα) is the eighteenth letter of the Greek alphabet. In the system of Greek numerals, it has a value of 200. In general mathematics, uppercase Σ is used a ...
, which appears as ς at the end of a word and σ elsewhere. However, these two forms are normally stored as different characters; for instance,
Unicode Unicode, formally The Unicode Standard,The formal version reference is is an information technology standard for the consistent encoding, representation, and handling of text expressed in most of the world's writing systems. The standard, ...
has both and , and does not treat them as equivalent. For collation and comparison purposes, software should consider the string "δῖος Ἀχιλλεύς" equivalent to "δῖοσ Ἀχιλλεύσ", but for typesetting purposes they are distinct and CTL is not required to choose the correct form.


Implementations

Most text-rendering software that is capable of CTL will include information about specific scripts, and so will be able to render them correctly without font files needing to supply instructions on how to lay out characters. Such software is usually provided in a
library A library is a collection of materials, books or media that are accessible for use and not just for display purposes. A library provides physical (hard copies) or digital access (soft copies) materials, and may be a physical location or a vi ...
; examples include: *
Core Text Core Text is a Core Foundation style API in macOS, first introduced in Mac OS X 10.4 Tiger, made public in Mac OS X 10.5 Leopard, and introduced for the iPad with iPhone SDK 3.2. Exposing a C API, it replaces the text rendering abilities of th ...
for
macOS macOS (; previously OS X and originally Mac OS X) is a Unix operating system developed and marketed by Apple Inc. since 2001. It is the primary operating system for Apple's Mac computers. Within the market of desktop and la ...
* Uniscribe (with Universal Shaping Engine) and
DirectWrite DirectWrite is a text layout and glyph rendering API by Microsoft. It was designed to replace GDI/GDI+ and Uniscribe for screen-oriented rendering and was first shipped with Windows 7 and Windows Server 2008 R2, as well as Windows Vista and Win ...
for
Microsoft Windows Windows is a group of several proprietary graphical operating system families developed and marketed by Microsoft. Each family caters to a certain sector of the computing industry. For example, Windows NT for consumers, Windows Server for ...
* HarfBuzz, a
cross-platform In computing, cross-platform software (also called multi-platform software, platform-agnostic software, or platform-independent software) is computer software that is designed to work in several computing platforms. Some cross-platform software ...
library * Pango, a cross-platform library which nowadays incorporates HarfBuzz However, such software is unable to properly render any script for which it lacks instructions, which can include many minority scripts. The alternative approach is to include the rendering instructions in the font file itself. Rendering software still needs to be capable of reading and following the instructions, but this is relatively simple. Examples of this latter approach include
Apple Advanced Typography Apple Advanced Typography (AAT) is Apple Inc.'s computer technology for advanced font rendering, supporting internationalization and complex features for typographers, a successor to Apple's little-used QuickDraw GX font technology of the mid- ...
(AAT) and
Graphite Graphite () is a crystalline form of the element carbon. It consists of stacked layers of graphene. Graphite occurs naturally and is the most stable form of carbon under standard conditions. Synthetic and natural graphite are consumed on la ...
. Both of these names encompass both the instruction format and the software supporting it; AAT is included on
Apple An apple is an edible fruit produced by an apple tree (''Malus domestica''). Apple trees are cultivated worldwide and are the most widely grown species in the genus '' Malus''. The tree originated in Central Asia, where its wild ancest ...
operating system An operating system (OS) is system software that manages computer hardware, software resources, and provides common daemon (computing), services for computer programs. Time-sharing operating systems scheduler (computing), schedule tasks for ef ...
s, while Graphite is available for
Microsoft Windows Windows is a group of several proprietary graphical operating system families developed and marketed by Microsoft. Each family caters to a certain sector of the computing industry. For example, Windows NT for consumers, Windows Server for ...
and
Linux Linux ( or ) is a family of open-source Unix-like operating systems based on the Linux kernel, an operating system kernel first released on September 17, 1991, by Linus Torvalds. Linux is typically packaged as a Linux distribution, whi ...
-based systems. The
OpenType OpenType is a format for scalable computer fonts. It was built on its predecessor TrueType, retaining TrueType's basic structure and adding many intricate data structures for prescribing typographic behavior. OpenType is a registered trademark ...
format is primarily intended for systems using the first approach (layout knowledge in the renderer, not the font), but it has a few features that assist with CTL, such as contextual ligatures. AAT and Graphite instructions can be embedded in OpenType font files.


See also

*
Typography Typography is the art and technique of arranging type to make written language legible, readable and appealing when displayed. The arrangement of type involves selecting typefaces, point sizes, line lengths, line-spacing ( leading), an ...
*
Unicode Unicode, formally The Unicode Standard,The formal version reference is is an information technology standard for the consistent encoding, representation, and handling of text expressed in most of the world's writing systems. The standard, ...
* Writing systems which require complex text layout: ** Arabic alphabet ** Most of the
Brahmic The Brahmic scripts, also known as Indic scripts, are a family of abugida writing systems. They are used throughout the Indian subcontinent, Southeast Asia and parts of East Asia. They are descended from the Brahmi script of ancient In ...
family of scripts **
N'Ko script N'Ko () is a script devised by Solomana Kante in 1949, as a modern writing system for the Mandé languages of West Africa. The term ''N'Ko'', which means ''I say'' in all Mandé languages, is also used for the Mandé literary standard writt ...
**
Tengwar The Tengwar script is an artificial script, one of several scripts created by J. R. R. Tolkien, the author of ''The Lord of the Rings''. Within the fictional context of Middle-earth, the Tengwar were invented by the Elf Fëanor, and use ...
(diacritics and numbers)


References

{{Reflist


External links


Examples of complex rendering
SIL international SIL International (formerly known as the Summer Institute of Linguistics) is an evangelical Christian non-profit organization whose main purpose is to study, develop and document languages, especially those that are lesser-known, in order to e ...
's examples of complex writing systems around the world
Complex Text Layout
The Open Group's Desktop Technologies
Supporting Indic Scripts in Mozilla
— also other CTL scripts
Project SILA
Graphite Graphite () is a crystalline form of the element carbon. It consists of stacked layers of graphene. Graphite occurs naturally and is the most stable form of carbon under standard conditions. Synthetic and natural graphite are consumed on la ...
and
Mozilla Mozilla (stylized as moz://a) is a free software community founded in 1998 by members of Netscape. The Mozilla community uses, develops, spreads and supports Mozilla products, thereby promoting exclusively free software and open standards, ...
integration project
CTL Architecture in Solaris
— Solaris Globalization Whitepapers
Complex Scripts
— Microsoft Global Development and Computing Portal
Theppitak's Homepage
— information about Thai language processing
HarfBuzz's page
at Freedesktop.org
D-Type Unicode Text Module — Portable software library for complex text

BidiRenderer
— An application that illustrates the shaping and layout of complex text in bidirectional paragraphs using FriBidi, FreeType, and HarfBuzz
Tehreer-Android
— A library that gives full control over text related technologies such as bidirectional algorithm, open type shaping, text typesetting and text rendering
Tehreer-Cocoa
— Standalone font/text engine for iOS Typesetting Indic computing Natural language and computing