Zero-width Spaces

The zero-width space (ZWSP) is a non-printing character used in computerized typesetting to indicate word boundaries to text-processing systems in scripts that do not use explicit spacing, or after characters (such as the slash) that are not followed by a visible space but after which a line break is nevertheless permitted. It is also used with languages that have no visible space between words, for example Japanese. Normally it produces no visible separation, but it may expand in passages that are fully justified.


Usage

In HTML pages, the zero-width space can be used to mark a potential line break without hyphenation, as can the HTML element <wbr>; for hyphenated line breaks, a soft hyphen is used. The zero-width space was not supported in some older web browsers. Its effect can be seen by comparing a run of words separated only by zero-width spaces with the same run containing no such spaces: in browsers that support the character, resizing the window re-breaks the first text only at word boundaries, while the second text is not broken at all.
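The idea of marking a break opportunity after a slash can be sketched in a few lines of Python. This is only an illustration: the helper name `allow_breaks_after_slashes` and the sample path are assumptions made for the example, and whether a line actually breaks at the inserted character depends on the renderer.

```python
ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE


def allow_breaks_after_slashes(text: str) -> str:
    """Insert a zero-width space after every slash.

    The result looks the same when rendered, but a renderer that
    supports the character may wrap the line after any slash.
    """
    return text.replace("/", "/" + ZWSP)


path = "usr/local/share/doc"          # hypothetical sample text
marked = allow_breaks_after_slashes(path)

# Three slashes, so three invisible code points were added.
print(len(path), len(marked))          # 19 22

# Removing the zero-width spaces restores the original string.
print(marked.replace(ZWSP, "") == path)  # True
```

Because the inserted character is invisible, text processed this way no longer compares equal to the original, so it is usually safer to add the break marks only at the final output stage.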


Prohibited in URLs

ICANN rules prohibit domain names from including non-displayed characters such as the zero-width space, and most browsers prohibit their use within domain names because they can be used to create a homograph attack, in which a malicious URL is visually indistinguishable from a legitimate one.
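A rough way to spot such characters in a hostname is to look for code points in the Unicode general category Cf (format), which includes the zero-width space. The sketch below is an assumption-laden illustration, not the check that ICANN or any browser actually performs; the function name `invisible_code_points` and the sample hostnames are invented for the example.

```python
import unicodedata


def invisible_code_points(hostname: str) -> list[str]:
    """Return 'U+XXXX' labels for every format (Cf) character in hostname.

    Category Cf covers the zero-width space (U+200B) as well as related
    invisible characters such as U+200C, U+200D, U+2060 and U+FEFF.
    """
    return [
        f"U+{ord(ch):04X}"
        for ch in hostname
        if unicodedata.category(ch) == "Cf"
    ]


clean = "example.com"
spoofed = "exam\u200bple.com"  # renders like the clean name

print(invisible_code_points(clean))    # []
print(invisible_code_points(spoofed))  # ['U+200B']
print(clean == spoofed)                # False, although they look alike
```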


Encoding

The zero-width space character is encoded in Unicode as U+200B ZERO WIDTH SPACE, and input in HTML as &#8203;, &#x200B; or &ZeroWidthSpace;. Contrary to what their names suggest, the character entities &NegativeThickSpace;, &NegativeMediumSpace;, &NegativeThinSpace;, and &NegativeVeryThinSpace; also refer to the zero-width space. The TeX representation is \hskip0pt; the LaTeX representation is \hspace{0pt}; and the groff representation is \:. Its semantics and HTML implementation are similar to those of the soft hyphen, except that a soft hyphen displays a hyphen character at the point where the line is broken.
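These encodings can be checked with the Python standard library, using the character's code point, U+200B. The snippet is a small sketch; it relies on `html.unescape`, which resolves numeric references and the named HTML5 entities, and the variable names are chosen only for the example.

```python
import html
import unicodedata

ZWSP = "\u200b"

print(unicodedata.name(ZWSP))      # ZERO WIDTH SPACE
print(unicodedata.category(ZWSP))  # Cf (format character)
print(ZWSP.encode("utf-8"))        # b'\xe2\x80\x8b'

# Numeric and named HTML references that all resolve to U+200B.
references = [
    "&#8203;",
    "&#x200B;",
    "&ZeroWidthSpace;",
    "&NegativeThickSpace;",
    "&NegativeMediumSpace;",
    "&NegativeThinSpace;",
    "&NegativeVeryThinSpace;",
]
resolved = {ref: html.unescape(ref) == ZWSP for ref in references}
print(all(resolved.values()))      # True
```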


See also

* Hair space
* Whitespace character – including a table comparing various space-like characters
* Word divider
* Word wrapping
* Word joiner (U+2060), as well as zero-width no-break space (U+FEFF)
* Zero-width joiner (U+200D)
* Zero-width non-joiner (U+200C)


References


Sources

* Unicode Consortium, "Special Areas and Format Characters" (Chapter 16), The Unicode Standard, Version 5.2.
* Victor H. Mair, Yongquan Liu, Characters and computers, IOS Press, 1991.