In word processing and digital typesetting, a non-breaking space (NBSP), also called required space, hard space, or fixed space (though it is not of fixed width), is a space character that prevents an automatic line break at its position. In some formats, including HTML, it also prevents consecutive whitespace characters from collapsing into a single space. Non-breaking space characters with other widths also exist.
Uses and variations
Despite having layout and uses similar to those of whitespace, it differs in contextual behavior.
Non-breaking behavior
Text-processing software typically assumes that an automatic line break may be inserted anywhere a space character occurs; a non-breaking space prevents this from happening (provided the software recognizes the character). For example, if the text "100 km" will not quite fit at the end of a line, the software may insert a line break between "100" and "km". An editor who finds this behavior undesirable may choose to use a non-breaking space between "100" and "km". This guarantees that the text "100 km" will not be broken: if it does not fit at the end of a line, it is moved in its entirety to the next line.
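The "100 km" case can be reproduced with a short Python sketch. It assumes the standard `textwrap` module, which breaks lines only at ASCII whitespace and so leaves U+00A0 (the Unicode code point of the non-breaking space) intact. The sample sentence and the 20-column width are illustrative choices, not taken from the text above.

```python
import textwrap

NBSP = "\u00a0"  # NO-BREAK SPACE

# Same sentence twice: once with an ordinary space, once with a non-breaking space.
with_space = "The distance is 100 km"
with_nbsp = "The distance is 100" + NBSP + "km"

# A 20-column line cannot hold the full 22-character sentence.
lines_space = textwrap.wrap(with_space, width=20)
lines_nbsp = textwrap.wrap(with_nbsp, width=20)

print(lines_space)  # the break falls between "100" and "km"
print(lines_nbsp)   # "100" and "km" move to the next line together
```

Other line-breaking software may behave differently; as noted above, the effect depends on the software recognizing the character.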
Non-collapsing behavior
A second common application of non-breaking spaces is in plain text file formats such as SGML, HTML, TeX and LaTeX, whose rendering engines are programmed to treat sequences of whitespace characters (space, newline, tab, form feed, etc.) as if they were a single character (but this behavior can be overridden). Such "collapsing" of whitespace allows the author to neatly arrange the source text using line breaks, indentation and other forms of spacing without affecting the final typeset result.
In contrast, non-breaking spaces are not merged with neighboring whitespace characters when displayed, so an author can use them to insert additional visible space in the output without resorting to spans styled with special values of the CSS "white-space" property. Conversely, indiscriminate use (see the recommended use in style guides), in addition to a normal space, gives extraneous space in the output.
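The collapsing rule can be sketched in a few lines of Python. This is a simplified model, not a real rendering engine: the hypothetical helper `collapse_whitespace` replaces each run of space, tab, line feed, form feed and carriage return with one space, and it is an assumption of the sketch that U+00A0 is simply left out of that set.

```python
import re

NBSP = "\u00a0"  # NO-BREAK SPACE

# Characters that this simplified model collapses: space, tab, line feed,
# form feed, carriage return. U+00A0 is deliberately absent.
COLLAPSIBLE = re.compile(r"[ \t\n\f\r]+")

def collapse_whitespace(source: str) -> str:
    """Replace each run of collapsible whitespace with a single space."""
    return COLLAPSIBLE.sub(" ", source)

indented = "Lorem   \n\t  ipsum"         # source neatly arranged with a line break
padded = "Lorem" + NBSP * 3 + "ipsum"    # three non-breaking spaces
mixed = "Lorem " + NBSP + " ipsum"       # NBSP in addition to normal spaces

print(repr(collapse_whitespace(indented)))  # one space remains
print(repr(collapse_whitespace(padded)))    # all three NBSPs survive
print(repr(collapse_whitespace(mixed)))     # space + NBSP + space: extraneous width
```

The explicit character class is a deliberate choice: in Python, `\s` applied to a `str` also matches U+00A0, which would collapse the non-breaking spaces as well.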
Width variation
Other non-breaking variants, defined in Unicode:
U+202F NARROW NO-BREAK SPACE (NNBSP)
:It was introduced in Unicode 3.0 for Mongolian, to separate a suffix from the word stem without indicating a word boundary. It is also required for big punctuation in French, where it is sometimes inaccurately referred to as "double punctuation" (before ;, ?, !, », › and after «, ‹; today often also before :), in Russian (before the em dash —), and in German between multi-part abbreviations (e.g. "''z.B.''", "''d.h.''", "''v.l.n.r.''"). When used with Mongolian, its width is usually one third of the normal space; in other contexts, its width is about 70% of the normal space but may resemble that of the thin space (U+2009), at least with some fonts. Also, starting from release 34 of the Unicode Common Locale Data Repository (CLDR), the NNBSP is used in numbers as the thousands group separator for the French and Spanish locales.
U+2007 FIGURE SPACE
:Produces a space equal in width to the figure (0–9) characters.
U+2060 WORD JOINER
:Encoded in Unicode since version 3.2. The word joiner does not produce any space and prohibits a line break at its position.
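As a rough illustration, the Python sketch below looks up these characters in the standard `unicodedata` module and uses the narrow no-break space as a thousands separator. The code points U+202F (narrow no-break space), U+2007 (figure space) and U+2060 (word joiner) are stated here explicitly, with U+00A0 added for comparison. The helper `group_thousands` is hypothetical and does not consult CLDR; real applications would normally rely on a locale-aware library.

```python
import unicodedata

# Code points of the variants described above, plus U+00A0 for comparison.
VARIANTS = ["\u00a0", "\u202f", "\u2007", "\u2060"]

for ch in VARIANTS:
    # Zs = space separator (has a width); Cf = format character (no width of its own).
    print(f"U+{ord(ch):04X}  {unicodedata.category(ch)}  {unicodedata.name(ch)}")

def group_thousands(n: int, sep: str = "\u202f") -> str:
    """Group the digits of n in threes, separated by sep (NNBSP by default)."""
    return f"{n:,}".replace(",", sep)

print(group_thousands(1234567))  # digit groups joined by U+202F
```

The general categories reflect the descriptions above: the three spaces are classed as space separators, while the word joiner is a format character that contributes no width.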
Example
In a web browser, resizing the window demonstrates the effect of non-breaking spaces on the texts below.
To show the non-breaking effect of the non-breaking space, the following words have been separated with non-breaking spaces:
Lorem Ipsum Dolor Sit Amet Consectetur Adipiscing Elit Sed Do Eiusmod Tempor Incididunt Ut Labore Et Dolore Magna Aliqua Ut Enim Ad Minim Veniam Quis Nostrud Exercitation Ullamco Laboris Nisi Ut Aliquip Ex Ea Commodo Consequat Duis Aute
To show the non-collapsing behavior of the non-breaking space, the following words have been separated with an increasing number of non-breaking spaces:
Lorem Ipsum Dolor Sit Amet Consectetur Adipiscing Elit Sed Do Eiusmod Tempor Incididunt Ut Labore Et Dolore Magna Aliqua Ut Enim Ad Minim
In contrast, the following words are separated with ordinary spaces:
Lorem Ipsum Dolor Sit Amet Consectetur Adipiscing Elit Sed Do Eiusmod Tempor Incididunt Ut Labore Et Dolore Magna Aliqua Ut Enim Ad Minim Veniam Quis Nostrud Exercitation Ullamco Laboris Nisi Ut Aliquip Ex Ea Commodo Consequat Duis Aute
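Because a non-breaking space looks identical to an ordinary space in source text, the three kinds of sample can also be generated programmatically. The following Python sketch builds shortened versions of them; the word list is abbreviated, the helper `join_increasing` is a hypothetical name, and U+00A0 is assumed as the non-breaking space.

```python
NBSP = "\u00a0"  # NO-BREAK SPACE

WORDS = "Lorem Ipsum Dolor Sit Amet Consectetur Adipiscing Elit".split()

def join_increasing(words, filler=NBSP):
    """Join a non-empty word list with 1, 2, 3, ... copies of filler between pairs."""
    parts = [words[0]]
    for gap, word in enumerate(words[1:], start=1):
        parts.append(filler * gap + word)
    return "".join(parts)

non_breaking = NBSP.join(WORDS)          # one run with no break opportunity
non_collapsing = join_increasing(WORDS)  # gaps widen from 1 to 7 NBSPs
ordinary = " ".join(WORDS)               # breaks and collapses normally

for sample in (non_breaking, non_collapsing, ordinary):
    print(repr(sample))
```

Printing with `repr` makes the otherwise invisible difference visible, since U+00A0 is shown as `\xa0`.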
Encodings
In Unicode, the byte order mark (BOM), U+FEFF, may be interpreted as a "zero width no-break space", but this use is deprecated in favor of the word joiner (U+2060).
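A minimal Python sketch of how text containing U+FEFF might be handled under that guidance follows. The helper `modernize_zwnbsp` is hypothetical: it treats a leading U+FEFF as a byte order mark and drops it, and replaces any later occurrence with the word joiner. Whether that replacement is appropriate depends on the application.

```python
import unicodedata

BOM = "\ufeff"          # formal Unicode name: ZERO WIDTH NO-BREAK SPACE
WORD_JOINER = "\u2060"  # WORD JOINER

def modernize_zwnbsp(text: str) -> str:
    """Drop a leading U+FEFF (a BOM) and turn any interior U+FEFF into U+2060."""
    if text.startswith(BOM):
        text = text[1:]
    return text.replace(BOM, WORD_JOINER)

# UTF-8 bytes: a BOM, then "a", an interior U+FEFF, and "b".
raw = b"\xef\xbb\xbf" + "a\ufeffb".encode("utf-8")

decoded = raw.decode("utf-8")           # the plain codec keeps the leading U+FEFF
decoded_sig = raw.decode("utf-8-sig")   # this codec strips only the leading one

print(unicodedata.name(BOM))
print(repr(modernize_zwnbsp(decoded)))
```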
Keyboard entry methods
It is rare for national or international standards on keyboard layouts to define an input method for the non-breaking space. An exception is the Finnish multilingual keyboard, accepted as the national standard SFS 5966 in 2008. According to the SFS setting, the non-breaking space can be entered with the key combination AltGr + Space.
[. Drafts of the Finnish multilingual keyboard.]
Typically, authors of keyboard drivers and application programs (e.g., word processors) have devised their own keyboard shortcuts for the non-breaking space.
Apart from this, applications and environments often have methods of entering Unicode characters directly via their code point, e.g. via the Alt Numpad input method. (The non-breaking space has code point 255 decimal (FF hex) in codepage 437 and codepage 850, and code point 160 decimal (A0 hex) in codepage 1252.)
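These code page values can be checked with Python's built-in codecs, as in the sketch below. It assumes that U+00A0 is the Unicode code point of the non-breaking space. The UTF-8 bytes and the HTML entity `&nbsp;` are included only for comparison and are not taken from the text above.

```python
import html

NBSP = "\u00a0"  # NO-BREAK SPACE

# Byte value of the non-breaking space in the code pages named above.
code_page_bytes = {}
for codec in ("cp437", "cp850", "cp1252"):
    (byte,) = NBSP.encode(codec)  # each of these codecs yields exactly one byte
    code_page_bytes[codec] = byte
    print(f"{codec}: {byte} decimal, {byte:02X} hex")

utf8_bytes = NBSP.encode("utf-8")         # two bytes in UTF-8
from_entity = html.unescape("&nbsp;")     # HTML named character reference

print(utf8_bytes, from_entity == NBSP)
```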
See also
* Hyphens in computing, for information about hard and non-breaking hyphens
* List of XML and HTML character entity references