The word joiner (WJ) is a format character in Unicode used to indicate that word separation should not occur at a position, when using scripts such as Arabic that do not use explicit spacing. It has been encoded since Unicode version 3.2 (released in 2002) as U+2060 WORD JOINER.
The word joiner does not produce any space and prohibits a
line break at its position. Thus, it is a nonbreaking space with zero width.
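As a minimal sketch of the behaviour described above, the following Python snippet inspects the word joiner (code point U+2060) with the standard `unicodedata` module and inserts it after a slash, where a line break would otherwise normally be permitted. The helper name `join_without_break` is hypothetical and used only for illustration; whether the break is actually suppressed depends on the layout engine that renders the text.

```python
import unicodedata

WORD_JOINER = "\u2060"  # U+2060 WORD JOINER

def join_without_break(left: str, right: str) -> str:
    """Glue two strings with a word joiner (hypothetical helper)."""
    return left + WORD_JOINER + right

glued = join_without_break("and/", "or")

print(unicodedata.name(WORD_JOINER))      # WORD JOINER
print(unicodedata.category(WORD_JOINER))  # Cf (format character)
print(len(glued))                         # 7: one invisible character was added
```

The string looks unchanged when displayed, because the word joiner has no width; only its length and its line-breaking behaviour differ.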
The word joiner replaces the ''zero-width no-break space'' (''ZWNBSP'', U+FEFF) in its role as a no-break space of zero width. Character U+FEFF is intended for use as a byte order mark (BOM) at the start of a file. However, if encountered elsewhere, it should, according to Unicode, be treated as a zero-width no-break space. The deliberate use of U+FEFF for this purpose is deprecated as of Unicode 3.2, with the word joiner strongly preferred.
[FAQ - UTF-8, UTF-16, UTF-32 & BOM, ''"What should I do with U+FEFF in the middle of a file?"'']
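One way to apply this guidance to existing text can be sketched as follows. The function `normalize_feff` is a hypothetical helper, not part of any library. It assumes that a leading U+FEFF is a byte order mark that the application wants to drop, and that every other U+FEFF was meant as a zero-width no-break space and can be replaced by the word joiner. Both are policy choices that may not suit every application.

```python
BOM = "\ufeff"          # U+FEFF: byte order mark, deprecated as ZWNBSP
WORD_JOINER = "\u2060"  # U+2060 WORD JOINER

def normalize_feff(text: str) -> str:
    """Drop a leading BOM and turn interior U+FEFF into U+2060 (assumed policy)."""
    if text.startswith(BOM):
        text = text[1:]  # leading U+FEFF is treated as a BOM and removed
    return text.replace(BOM, WORD_JOINER)

sample = "\ufeffand/\ufeffor"  # BOM at the start, U+FEFF used as ZWNBSP inside
cleaned = normalize_feff(sample)

# List the non-ASCII code points that remain after normalization.
print([f"U+{ord(c):04X}" for c in cleaned if ord(c) > 0x7F])  # ['U+2060']
```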
See also
* Byte order mark, which uses the U+FEFF (ZWNBSP) character
* Zero-width space
* Zero-width joiner