
In word processing and digital typesetting, a non-breaking space, also called NBSP, required space, hard space, or fixed space (though it is not of fixed width), is a space character that prevents an automatic line break at its position. In some formats, including HTML, it also prevents consecutive whitespace characters from collapsing into a single space. Non-breaking space characters with other widths also exist.


Uses and variations

Although a non-breaking space looks like ordinary whitespace and is used in similar ways, it differs in contextual behavior.


Non-breaking behavior

Text-processing software typically assumes that an automatic line break may be inserted anywhere a space character occurs; a non-breaking space prevents this from happening (provided the software recognizes the character). For example, if the text "100 km" will not quite fit at the end of a line, the software may insert a line break between "100" and "km". An editor who finds this behavior undesirable may choose to use a non-breaking space between "100" and "km". This guarantees that the text "100 km" will not be broken: if it does not fit at the end of a line, it is moved in its entirety to the next line.
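The "100 km" case can be reproduced with Python's standard `textwrap` module, which in current Python 3 releases treats only ASCII whitespace as a break opportunity and therefore keeps U+00A0 intact. This is a minimal sketch: the sample sentence and the width of 15 columns are arbitrary choices for illustration, and other text-processing software may behave differently.

```python
import textwrap

NBSP = "\u00a0"  # U+00A0 NO-BREAK SPACE

plain = "It is about 100 km away"        # ordinary space between "100" and "km"
glued = f"It is about 100{NBSP}km away"  # non-breaking space instead

# With 15 columns, "It is about 100" fills the first line exactly,
# so the ordinary space after "100" becomes the break point.
plain_lines = textwrap.wrap(plain, width=15)

# The non-breaking space is not a break opportunity, so "100 km"
# moves to the next line as a unit.
glued_lines = textwrap.wrap(glued, width=15)

print(plain_lines)
print(glued_lines)
```

In the second result, "100" and "km" stay on the same line, still joined by the non-breaking space.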


Non-collapsing behavior

A second common application of non-breaking spaces is in plain text file formats such as SGML, HTML, TeX and LaTeX, whose rendering engines are programmed to treat sequences of whitespace characters (space, newline, tab, form feed, etc.) as if they were a single character (although this behavior can be overridden). Such "collapsing" of whitespace allows the author to arrange the source text neatly using line breaks, indentation and other forms of spacing without affecting the final typeset result. In contrast, non-breaking spaces are not merged with neighboring whitespace characters when displayed, so an author can use them to insert additional visible space in the output without resorting to spans styled with special values of the CSS "white-space" property. Conversely, indiscriminate use of a non-breaking space in addition to a normal space (see the recommended use in style guides) gives extraneous space in the output.
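As a rough model of the collapsing rule, the sketch below replaces each run of ordinary whitespace with a single space, roughly as an HTML renderer does by default. The function name `collapse_whitespace` is hypothetical, and the character class is a simplification of real rendering rules. It deliberately avoids Python's `\s`, which also matches U+00A0.

```python
import re

NBSP = "\u00a0"  # U+00A0 NO-BREAK SPACE

# Collapsible whitespace in this simplified model: space, tab,
# line feed, carriage return and form feed. U+00A0 is not included.
COLLAPSIBLE = re.compile(r"[ \t\n\r\f]+")


def collapse_whitespace(source: str) -> str:
    """Replace each run of collapsible whitespace with one space."""
    return COLLAPSIBLE.sub(" ", source)


# Runs of ordinary whitespace shrink to a single space.
ordinary = collapse_whitespace("Lorem   \n\t  Ipsum")

# Non-breaking spaces survive, so the extra width is kept.
padded = collapse_whitespace(f"Lorem{NBSP}{NBSP}{NBSP}Ipsum")

print(repr(ordinary))
print(repr(padded))
```

The same caveat applies to `str.split()` without arguments, which treats U+00A0 as whitespace; code that must preserve non-breaking spaces needs an explicit separator set.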


Width variation

Other non-breaking variants, defined in Unicode:
* U+202F NARROW NO-BREAK SPACE (NNBSP): Introduced in Unicode 3.0 for Mongolian, to separate a suffix from the word stem without indicating a word boundary. It is also required in French for "big punctuation", sometimes inaccurately referred to as "double punctuation" (before ;, ?, !, », › and after «, ‹; today often also before :), in Russian (before the em dash, —), and in German between multi-part abbreviations (e.g. "z.B.", "d.h.", "v.l.n.r."). When used with Mongolian, its width is usually one third of the normal space; in other contexts, its width is about 70% of the normal space but may resemble that of the thin space (U+2009), at least with some fonts. Starting with release 34 of the Unicode Common Locale Data Repository (CLDR), the NNBSP is also used in numbers as the thousands group separator for the French and Spanish locales.
* U+2007 FIGURE SPACE: Produces a space equal in width to the figure (0–9) characters.
* U+2060 WORD JOINER: Encoded in Unicode since version 3.2. The word joiner does not produce any space and prohibits a line break at its position.
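The characters discussed in this section can be inspected with Python's standard `unicodedata` module. The sketch below prints the Unicode name and general category for the no-break space, figure space, narrow no-break space, word joiner and thin space; the helper name `describe` is hypothetical. Character widths depend on the font and cannot be read from this data.

```python
import unicodedata

# Code points for the characters discussed in this section.
CODE_POINTS = [0x00A0, 0x2007, 0x202F, 0x2060, 0x2009]


def describe(code_point):
    """Return (name, general category) for a Unicode code point."""
    ch = chr(code_point)
    return unicodedata.name(ch), unicodedata.category(ch)


for cp in CODE_POINTS:
    name, category = describe(cp)
    print(f"U+{cp:04X}  {category}  {name}")
```

The space characters report category `Zs` (space separator), while the word joiner reports `Cf` (format), consistent with it producing no visible space.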


Example

In a browser, resizing the window demonstrates the effect of non-breaking spaces on the texts below. To show the non-breaking effect of the non-breaking space, the following words have been separated with non-breaking spaces:
Lorem Ipsum Dolor Sit Amet Consectetur Adipiscing Elit Sed Do Eiusmod Tempor Incididunt Ut Labore Et Dolore Magna Aliqua Ut Enim Ad Minim Veniam Quis Nostrud Exercitation Ullamco Laboris Nisi Ut Aliquip Ex Ea Commodo Consequat Duis Aute
To show the non-collapsing behavior of the non-breaking space, the following words have been separated with an increasing number of non-breaking spaces:
Lorem Ipsum  Dolor   Sit    Amet     Consectetur      Adipiscing       Elit        Sed         Do          Eiusmod           Tempor            Incididunt             Ut              Labore               Et                Dolore                 Magna                  Aliqua                   Ut                    Enim                     Ad                      Minim
In contrast, the following words are separated with ordinary spaces:
Lorem Ipsum Dolor Sit Amet Consectetur Adipiscing Elit Sed Do Eiusmod Tempor Incididunt Ut Labore Et Dolore Magna Aliqua Ut Enim Ad Minim Veniam Quis Nostrud Exercitation Ullamco Laboris Nisi Ut Aliquip Ex Ea Commodo Consequat Duis Aute


Encodings

In Unicode, the byte order mark (BOM), U+FEFF, may be interpreted as a "zero width no-break space", but this use is deprecated in favor of the word joiner (U+2060).
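The dual role of U+FEFF can be seen from Python's standard library. This is a small sketch, with the sample payload `b"100 km"` chosen arbitrarily: the character's formal name still reflects its older role, while the `utf-8-sig` codec treats a leading U+FEFF purely as a signature.

```python
import codecs
import unicodedata

ZWNBSP = "\ufeff"  # byte order mark, formally named as a no-break space

# The Unicode character name of U+FEFF reflects its older role.
bom_name = unicodedata.name(ZWNBSP)

# Encoded as UTF-8, U+FEFF is the three-byte signature that the
# standard library exposes as codecs.BOM_UTF8.
bom_bytes = ZWNBSP.encode("utf-8")

# The "utf-8-sig" codec strips a leading U+FEFF when decoding,
# treating it as a signature rather than as text.
decoded = (bom_bytes + b"100 km").decode("utf-8-sig")

print(bom_name)
print(bom_bytes == codecs.BOM_UTF8)
print(repr(decoded))
```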


Keyboard entry methods

It is rare for national or international standards on keyboard layouts to define an input method for the non-breaking space. An exception is the Finnish multilingual keyboard, accepted as the national standard SFS 5966 in 2008. According to the SFS setting, the non-breaking space can be entered with the key combination AltGr + Space. Typically, authors of keyboard drivers and application programs (e.g., word processors) have devised their own keyboard shortcuts for the non-breaking space. Apart from this, applications and environments often have methods of entering Unicode characters directly via their code point, e.g. via the Alt Numpad input method. (The non-breaking space has code point 255 decimal (FF hex) in code page 437 and code page 850, and code point 160 decimal (A0 hex) in code page 1252.)
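The code page values quoted above can be checked with Python's built-in codecs. In this sketch, `cp437`, `cp850` and `cp1252` are the codec names Python uses for those code pages, and the byte values are the ones given in the text.

```python
NBSP = "\u00a0"  # U+00A0 NO-BREAK SPACE

# (Python codec name, byte value quoted in the text)
CODE_PAGES = [("cp437", 0xFF), ("cp850", 0xFF), ("cp1252", 0xA0)]

# Decode the single byte in each code page and record the result.
decoded = {name: bytes([value]).decode(name) for name, value in CODE_PAGES}

for name, ch in decoded.items():
    print(f"{name}: U+{ord(ch):04X}")

# In Unicode text the character is identified by its code point,
# 160 decimal, regardless of the legacy code page byte.
same_character = chr(160) == NBSP
```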


See also

* Hyphens in computing, for information about hard and non-breaking hyphens
* List of XML and HTML character entity references

