In writing, a space is a blank area that separates words, sentences, syllables (in syllabification) and other written or printed glyphs (characters). Conventions for spacing vary among languages, and in some languages the spacing rules are complex. Inter-word spaces ease the reader's task of identifying words, and avoid outright ambiguities such as "now here" vs. "nowhere". They also provide convenient guides for where a human or program may start new lines.
Typesetting can use spaces of varying widths, just as it can use graphic characters of varying widths. Unlike graphic characters, typeset spaces are commonly stretched in order to align text. The typewriter, on the other hand, typically has only one width for all characters, including spaces. Following widespread acceptance of the typewriter, some typewriter conventions influenced typography and the design of printed works.
Computer representation of text facilitates getting around mechanical and physical limitations such as character widths in at least two ways:
* Character encodings such as Unicode provide spaces of several widths, which are encoded using distinct numeric code points. For example, Unicode U+0020 is the "normal" space character, but U+00A0 adds the meaning that a new line should not be started there, while U+2003 represents a space with a fixed width of one em. Collectively, such characters are called whitespace characters.
* Formatting and drawing languages and software commonly provide much more flexibility in spacing. For example, SVG, PostScript, and countless other languages enable drawing characters at specific (x,y) coordinates on a screen or page. By drawing each word at a specific starting coordinate, such programs need not "draw" spaces at all (this can lead to difficulties in extracting the correct text back out). Similarly, word processors can "fully justify" text, stretching inter-word spaces to make all lines the same length (as can mechanical Linotype machines). Precision is limited by physical capabilities of output devices.
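The distinct code points mentioned in the first item can be inspected programmatically. The following is a minimal sketch using Python's standard `unicodedata` module; the `SPACES` list and the `describe` helper are illustrative names chosen here, not part of any standard.

```python
import unicodedata

# Three of the space characters discussed above (an illustrative selection).
SPACES = ["\u0020", "\u00A0", "\u2003"]

def describe(ch: str) -> str:
    """Return the code point, official Unicode name, and general category."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)} ({unicodedata.category(ch)})"

for ch in SPACES:
    print(describe(ch))
```

All three characters share the general category `Zs` (space separator), even though they differ in width and line-breaking behaviour.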
Use in natural languages
Between words
Modern English uses a space to separate words, but not all languages follow this practice. Spaces were not used to separate words in Latin until roughly 600–800 AD. Ancient Hebrew and Arabic did use spaces, partly to compensate in clarity for the lack of vowels. The earliest Greek script also used interpuncts to divide words rather than spacing, although this practice was soon displaced by the ''scriptio continua''.
Word spacing was later used by Irish and Anglo-Saxon scribes, beginning after the creation of the Carolingian minuscule by Alcuin of York and the scribes' adoption of it. The modern space originated here and then spread to the rest of the world. Indeed, the actions of these Irish and Anglo-Saxon scribes marked the dramatic shift for reading between antiquity and the modern period. Spacing became standard in Renaissance Italy and France, and then in Byzantium by the end of the 16th century; it entered the Slavic languages written in Cyrillic in the 17th century, and only in modern times entered modern Sanskrit.
CJK languages do not use spaces when dealing with text containing mostly Chinese characters and kana. In Japanese, spaces may occasionally be used to separate people's family names from given names, to denote omitted particles (especially the topic particle ''wa''), and for certain literary or artistic effects. Modern Korean, however, has spaces as an essential part of its writing system (because of Western influence), given the phonetic nature of the hangul script, which requires word dividers to avoid ambiguity, as opposed to Chinese characters, which are mostly very distinguishable from each other. In Korean, spaces are used to separate chunks of nouns, nouns and particles, adjectives, and verbs; for certain compounds or phrases, spaces may be used or not. For example, the phrase for "Republic of Korea" is usually spelled without spaces as 대한민국 rather than with a space as 대한 민국.
Runic texts use either an interpunct-like or a colon-like punctuation mark to separate words. There are two Unicode characters dedicated to this: U+16EB RUNIC SINGLE PUNCTUATION (᛫) and U+16EC RUNIC MULTIPLE PUNCTUATION (᛬).
Between sentences
Languages with a Latin-derived alphabet have used various methods of sentence spacing since the advent of movable type in the 15th century.
* One space (sometimes called ''French spacing'', ''q.v.''). This is a common convention in most countries that use the ISO basic Latin alphabet for published and final written work, as well as digital (World Wide Web) media. Web browsers usually do not differentiate between single and multiple spaces in source code when displaying text, unless the text is given a "white-space" CSS property. Without this being set, collapsing strings of spaces to a single space allows HTML source code to be spaced in a more machine-readable way, at the expense of control over the spacing of the rendered page.
* Double space (''English spacing''). It is sometimes claimed that this convention stems from the use of the monospaced font on typewriters. However, instructions to use more spacing between sentences than words date back centuries, and two spaces on a typewriter was the closest approximation to typesetters' previous rules aimed at improving readability. Wider spacing continued to be used by both typesetters and typists until the Second World War, after which typesetters gradually transitioned to word spacing between sentences in published print, while typists continued the practice of using two spaces.
* One widened space, typically one-and-a-third to slightly less than twice as wide as a word space. This spacing was sometimes used in typesetting before the 19th century. It has also been used in other non-typewriter typesetting systems such as the Linotype machine and the TeX system. Modern computer-based digital fonts can adjust the spacing after terminal punctuation as well, creating a space slightly wider than a standard word space.
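The browser behaviour described under the one-space convention, where runs of spaces in source text collapse to a single space, can be approximated in a few lines. This is a rough sketch only: the `collapse_spaces` helper is a hypothetical name, and real browsers apply further rules (for example at line starts and ends) that are not modelled here.

```python
import re

def collapse_spaces(text: str) -> str:
    """Replace each run of spaces, tabs, and line breaks with one space."""
    return re.sub(r"[ \t\r\n\f]+", " ", text)

source = "End of one sentence.  Start of\n    the next."
print(collapse_spaces(source))  # End of one sentence. Start of the next.
```

The pattern deliberately leaves the no-break space (U+00A0) alone, so a double space typed by an author is lost while explicitly inserted no-break spaces survive.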
There has been some controversy regarding the proper amount of sentence spacing in typeset material. The ''Elements of Typographic Style'' states that only a single word space is required for sentence spacing. Psychological studies suggest "readers benefit from having two spaces after periods."
Unit symbols and numbers
The International System of Units (SI) prescribes inserting a space between a number and a unit of measurement (the space being regarded as an implied multiplication sign) but never between a prefix and a base unit; a space (or a multiplication dot) should also be used between units in compound units.
: 5.0 cm, ''not'' 5.0cm or 5.0 c m
: 45 kg, ''not'' 45kg or 45 k g
: 32 °C, ''not'' 32°C or 32° C
: 20 kN m or 20 kN⋅m, ''not'' 20 kNm or 20 k Nm
: π/2 rad, ''not'' π/2rad
: 50 %, ''not'' 50% (Note: % is not an SI unit, and many style guides do not follow this recommendation; note that the unspaced form 50% is used as an adjective, e.g. to express concentration as in 50% acetic acid.)
The only exception to this rule is the traditional symbolic notation of angles: degree (e.g., 30°), minute of arc (e.g., 22′), and second of arc (e.g., 8″).
The SI also prescribes the use of a space (often typographically a thin space) as a thousands separator where required. Both the point and the comma are reserved as decimal markers.
: 1 000 000 000 000 (thin space) or 1000000 ''not'' 1,000,000 or 1.000.000
: 1 000 000 000 000 (regular space which is significantly wider)
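As an illustration of the thin-space convention, the sketch below groups the digits of an integer in threes using U+2009 THIN SPACE. The `group_digits` helper and the `THIN_SPACE` constant are names assumed for this example; the approach simply reuses Python's built-in comma grouping and swaps the separator.

```python
THIN_SPACE = "\u2009"  # U+2009 THIN SPACE

def group_digits(n: int, sep: str = THIN_SPACE) -> str:
    """Group the digits of an integer in threes, separated by `sep`."""
    # The "," format option inserts a comma every three digits;
    # the commas are then replaced by the requested separator.
    return f"{n:,}".replace(",", sep)

print(group_digits(1_000_000_000_000))   # thin spaces between groups
print(group_digits(1_000_000_000_000, " "))  # regular spaces, visibly wider
```

Passing a different `sep` makes it easy to compare the thin-space and regular-space renderings shown above.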
Sometimes a narrow non-breaking space or non-breaking space, respectively, is recommended (as in, for example, IEEE Standards and IEC standards) to avoid the separation of units and values or parts of compound units, due to automatic line wrap and word wrap.
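A minimal sketch of that recommendation follows, assuming a hypothetical `quantity` helper that joins a value to its unit with U+00A0 NO-BREAK SPACE (or U+202F NARROW NO-BREAK SPACE when `narrow` is set) so that automatic line wrapping cannot split them.

```python
NBSP = "\u00A0"   # U+00A0 NO-BREAK SPACE
NNBSP = "\u202F"  # U+202F NARROW NO-BREAK SPACE

def quantity(value: str, unit: str, narrow: bool = False) -> str:
    """Join a value and a unit so that no line break can fall between them.

    Ordinary spaces inside a compound unit such as "kN m" are replaced too.
    """
    sep = NNBSP if narrow else NBSP
    return f"{value}{sep}{unit.replace(' ', sep)}"

print(quantity("5.0", "cm"))
print(quantity("20", "kN m"))
print(quantity("45", "kg", narrow=True))
```

The value is passed as a string so that the caller keeps control of the number formatting (for example the decimal marker).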
Encoding
In URLs, a space is percent-encoded using its ASCII/UTF-8 representation, %20.
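For example, Python's standard `urllib.parse` module performs this encoding. The snippet is a small sketch, and the file name is made up for illustration.

```python
from urllib.parse import quote, unquote

path = "my file name.txt"      # hypothetical file name containing spaces
encoded = quote(path)          # each space becomes %20
decoded = unquote(encoded)     # %20 is turned back into a space

print(encoded)  # my%20file%20name.txt
print(decoded)  # my file name.txt
```

Note that HTML form submissions use a related but different convention, in which a space is written as `+`; in the same module, `quote_plus` produces that form.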
Types of spaces
* Figure space
* Non-breaking space
* Paren space
* Thin space
* Visible space
* Zero-width space
See also
* Em (typography)
* En (typography)
* Halfwidth and fullwidth forms
* Internal field separator
* Sentence spacing in digital media
* Underscore
* Whitespace character