The right-to-left mark (RLM) is a non-printing character used in the computerized typesetting of bi-directional text containing a mix of left-to-right scripts (such as Latin and Cyrillic) and right-to-left scripts (such as Arabic, Syriac, and Hebrew). RLM is used to change the way adjacent characters are grouped with respect to text direction. However, for Arabic script, the Arabic letter mark may be a better choice.


Unicode

In Unicode, the RLM character is encoded at U+200F RIGHT-TO-LEFT MARK. In UTF-8 it is E2 80 8F. Usage is prescribed in the Unicode Bidi (bidirectional) Algorithm (Unicode 12.0 Standard, http://www.unicode.org/versions/Unicode12.0.0/UnicodeStandard-12.0.pdf, p. 880).
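
The encoding stated above can be checked with a short Python sketch. It decodes the UTF-8 bytes E2 80 8F back to a code point and looks up that character's properties in the `unicodedata` module of the standard library. The variable names are illustrative only, and the reported properties depend on the Unicode database bundled with the Python build in use.

```python
import unicodedata

# Decode the UTF-8 byte sequence quoted in the text back into a character.
RLM = bytes.fromhex("E2808F").decode("utf-8")

# Re-encode it to confirm the round trip, formatted as space-separated hex.
utf8_hex = " ".join(f"{b:02X}" for b in RLM.encode("utf-8"))
code_point = f"U+{ord(RLM):04X}"

print(code_point)                      # code point of the decoded character
print(utf8_hex)                        # its UTF-8 bytes
print(unicodedata.name(RLM))           # formal character name
print(unicodedata.category(RLM))       # general category (Cf = format character)
print(unicodedata.bidirectional(RLM))  # bidi class (R = strong right-to-left)
```

The bidi class is the property that matters for the rest of the article: although the character is invisible, the bidirectional algorithm treats it like a strong right-to-left letter.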


Example of use in HTML

Suppose the writer wishes to inject a run of Arabic or Hebrew (i.e. right-to-left) text into an English paragraph, with an exclamation point at the end of the run on the left-hand side, as in "I enjoyed staying -- really! -- at his house." With the "really!" in Hebrew, the sentence renders as follows:

I enjoyed staying -- באמת! -- at his house.

(Note that in a computer's memory, the order of the Hebrew characters is ‭ב,א,מ,ת‬.)

With an RLM added after the exclamation mark, it renders as follows:

I enjoyed staying -- באמת!‏ -- at his house.

(Standards-compliant browsers will render the exclamation mark on the right in the first example, and on the left in the second.)

This happens because the browser recognizes that the paragraph is in a LTR script (Latin), and applies punctuation, which is neutral as to its direction, in coordination with the surrounding (left-to-right) text. The RLM causes the punctuation to be surrounded by only RTL text (the Hebrew and the RLM) and hence to be positioned as if it were in right-to-left text, i.e., to the left of the preceding text.
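
The reasoning above can be illustrated with a minimal Python sketch. It does not run the full bidirectional algorithm; it only inspects the bidi classes of the characters on either side of the exclamation mark, with and without the RLM. The helper name `neighbour_classes` is hypothetical, and `&rlm;` is the HTML named character reference for this character, resolved here with `html.unescape` from the standard library.

```python
import html
import unicodedata

# "&rlm;" is how the mark is usually written in HTML source.
RLM = html.unescape("&rlm;")

# Hebrew letters bet, alef, mem, tav in logical (memory) order.
hebrew = "\u05d1\u05d0\u05de\u05ea"

without_rlm = f"I enjoyed staying -- {hebrew}! -- at his house."
with_rlm = f"I enjoyed staying -- {hebrew}!{RLM} -- at his house."


def neighbour_classes(text):
    """Return the bidi classes of the characters just before and after '!'."""
    i = text.index("!")
    before = unicodedata.bidirectional(text[i - 1])
    after = unicodedata.bidirectional(text[i + 1])
    return before, after


# The exclamation mark itself is directionally neutral.
bang_class = unicodedata.bidirectional("!")

print(bang_class)
print(neighbour_classes(without_rlm))  # Hebrew letter on one side, whitespace on the other
print(neighbour_classes(with_rlm))     # strong right-to-left characters on both sides
```

With the RLM in place, the neutral exclamation mark sits between two strong right-to-left characters, which is the condition the paragraph above describes for it to be laid out as part of the right-to-left run.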


See also

* Arabic letter mark
* Left-to-right mark
* Bidirectional text


External links


Unicode standard annex #9: The bidirectional algorithm