An underscore (_), also called an underline, low line, or low dash, is a line drawn under a segment of text. In proofreading, underscoring is a convention that says "set this text in italic type", traditionally used on manuscript or typescript as an instruction to the printer. Its use to add emphasis in modern documents is a deprecated practice.
The underscore character, _, originally appeared on the typewriter and was primarily used to emphasise words as in the proofreader's convention. To produce an underscored word, the word was typed, the typewriter carriage was moved back to the beginning of the word, and the word was overtyped with the underscore character.
In modern usage, underscoring is achieved by markup or with combining characters. The original free-standing underscore character continues in use to create visual spacing within a sequence of characters where a whitespace character is not permitted (e.g., in computer filenames, email addresses, and Internet URLs). In contexts where no formatting is supported, such as instant messaging or older email formats, the 'enclosing underscores' markup is sometimes used as a proxy for underlining the enclosed word or words (for example, _word_ standing in for an underlined "word").
In some languages, the mark is used as a combining diacritic and is called a "combining low line".
Diacritic
The underscore is used as a diacritic mark, "combining low line" (◌̲), in some languages of Egypt, some languages using the Rapidolangue orthography in Gabon, Izere in Nigeria, and indigenous languages of the Americas such as Shoshoni and Kiowa.
The combining diacritic ◌̱ (macron below) is similar to the combining low line, but its mark is shorter. The difference between "macron below" and "low line" is that the latter results in an unbroken underline when it is run together: compare a̱ḇc̱ and a̲b̲c̲ (only the latter should look like an underlined "abc").
Modern use
In printed documents underlining is generally avoided, with italics or small caps often used instead, or (especially in headings) capitalization, bold type or greater body height (font size).
In a manuscript to be typeset, various forms of underlining (see below) were therefore conventionally used to indicate that text should be set in special type such as italics, part of a procedure known as markup.
A series of underscores (like __________ ) may be used to create a blank to be filled in by hand on a paper form. It is also sometimes used to create a horizontal line; other symbols with similar graphemes, such as hyphens and dashes, are also used for this purpose.
Usage in computing
In web browsers, default settings typically distinguish hyperlinks by underlining them (and usually changing their color), but both users and websites can change the settings to make some or all hyperlinks appear differently (or even without distinction from normal text).
History
As early output devices (teleprinters, CRTs and line printers) could not produce more than one character at a location, it was not possible to underscore text, so early encodings such as ITA2 and the first versions of ASCII had no underscore. IBM's EBCDIC character-coding system, introduced in 1964, added the underscore, which IBM referred to as the "break character". IBM's report on NPL (the early name of what is now called PL/I) leaves the character set undefined, but specifically mentions the break character and gives an example identifier that contains it. By 1967 the underscore had spread to ASCII, replacing the similarly shaped left-arrow character, ← (see also: PIP).
C, developed at Bell Labs in the early 1970s, allowed the underscore in identifiers.
The underscore predates the existence of lower-case letters in many systems, so it often had to be used to make multi-word identifiers, since CamelCase (see below) was not available.
Programming conventions
Underscores inserted between letters are very commonly used to make a "multi-word" identifier in languages that cannot handle spaces in identifiers. This convention is known as "snake case" (the other popular method is called camelCase, where capital letters are used to show where the words start).
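The two conventions can be illustrated with a minimal Python sketch. The helper names `to_snake_case` and `to_camel_case` are hypothetical, chosen only for this example, and the code assumes each word consists of plain ASCII letters.

```python
def to_snake_case(words):
    """Join words with underscores, all in lower case."""
    return "_".join(word.lower() for word in words)


def to_camel_case(words):
    """Join words without separators; capitalize every word except the first."""
    if not words:
        return ""
    first, *rest = words
    return first.lower() + "".join(word.capitalize() for word in rest)


words = ["max", "line", "length"]
print(to_snake_case(words))  # max_line_length
print(to_camel_case(words))  # maxLineLength
```

In snake case the underscore carries the word boundary, so the result stays readable even on systems that fold everything to a single letter case; camel case relies on the case distinction instead.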
An underscore as the first character in an identifier is often used to indicate an internal implementation that is not considered part of the API and should not be called by code outside that implementation. In Dart, all private properties of classes must start with an underscore; this usage is also common in other languages such as C++ even though those provide keywords to indicate that members are private. It is extensively used to hide variables and functions used for implementations in header files. In fact, the use of a single underscore for this became so common that C compilers had to standardize on a ''double'' leading underscore (for instance __DATE__) for actual built-in variables to avoid conflicts with the ones in header files.
PHP "reserves all function names starting with __ as magical."
Python uses names that both start and end with double underscores for magic members used for purposes such as operator overloading and reflection, and names starting but not ending with a double underscore to denote private member variables of classes, which are mangled in a manner that prevents them from colliding with members of derived classes unless the classes have the same name (for example, __x in class C is mangled to _C__x). By convention, members starting with a single underscore are considered private or protected, although this only has an inherent effect for modules, where "from module import *" statements by default import all names that do not start with an underscore, unless an export list is explicitly defined by the module.
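The following Python sketch shows the three underscore forms side by side. The class names `Account` and `Savings` and their attributes are invented for illustration; only the naming rules themselves come from the language.

```python
class Account:
    def __init__(self):
        self.owner = "public"   # ordinary public attribute
        self._cache = {}        # single underscore: private by convention only
        self.__balance = 0      # double leading underscore: name-mangled

    def __repr__(self):         # "magic" method: starts and ends with __
        return f"Account(balance={self.__balance})"


class Savings(Account):
    def __init__(self):
        super().__init__()
        self.__balance = 100    # stored under a different mangled name


acct = Savings()
names = sorted(vars(acct))
print(names)
# ['_Account__balance', '_Savings__balance', '_cache', 'owner']
print(repr(acct))  # Account(balance=0)
```

Because each class mangles `__balance` with its own name, the subclass attribute does not overwrite the one used by `Account.__repr__`. The single-underscore attribute `_cache` is stored unchanged; nothing stops outside code from reading it.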
A variable named with just an underscore often has special meaning. $_ or _ is the previous command or result in many interactive shells, such as those of Python, Ruby, and Perl. In Perl, @_ is a special array variable that holds the arguments to a function. In Clojure, _ indicates an argument whose value will be ignored.
In some languages with pattern matching, such as Prolog, Standard ML, Scala, OCaml, Haskell, Erlang, and the Wolfram Language, the pattern _ matches any value, but does not perform binding.
HTML <u> and CSS
The ASCII underscore character can be inserted with the entities &#95; or &#x5F; (or &lowbar; or &UnderBar;).
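As a quick check, character references for U+005F can be decoded with `html.unescape` from Python's standard library. The four spellings used below (decimal, hexadecimal, and the named forms `&lowbar;` and `&UnderBar;`) are assumed here to be the ones meant; each should decode to the underscore.

```python
import html

# Character references assumed to denote the ASCII underscore (U+005F)
references = ["&#95;", "&#x5F;", "&lowbar;", "&UnderBar;"]

# Decode each reference to the text it stands for
decoded = {ref: html.unescape(ref) for ref in references}

for ref, text in decoded.items():
    code_points = " ".join(f"U+{ord(char):04X}" for char in text)
    print(f"{ref!r} -> {text!r} ({code_points})")
```

A reference that the decoder does not recognise is left unchanged by `html.unescape`, so an unexpected spelling shows up in the output as the original string rather than as an error.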
HTML has a presentational element, <u>, that was originally used to underline text; this usage was deprecated in HTML4 in favor of the CSS style text-decoration: underline.
In HTML5, the <u> tag reappeared but its meaning was changed significantly: it now "represents a span of inline text which should be rendered in a way that indicates that it has a non-textual annotation".
This facility is intended, for example, to provide a red wavy underline that flags spelling errors at input time but is not embedded in any stored file (unlike an emphasis mark, which would be).
The element may also exist in other markup languages, such as MediaWiki. The Text Encoding Initiative (TEI) provides an extensive selection of related elements for marking editorial activity (insertion, deletion, correction, addition, etc.).
Unicode
Unicode has a free-standing underscore at U+005F, which is a legacy of the typewriter practice of underlining using backspace and overtype. Modern practice uses the combining diacritic "combining low line" at U+0332 ◌̲ that results in an underline when run together: u̲n̲d̲e̲r̲l̲i̲n̲e̲. Unicode also has the combining macron below, a single-letter diacritic.
* ''single underline:'' a̲b̲c̲d̲e̲f̲g̲h̲i̲j̲k̲l̲m̲n̲o̲p̲q̲r̲s̲t̲u̲v̲w̲x̲y̲z̲0̲1̲2̲3̲4̲5̲6̲7̲8̲9̲
* ''double underline:'' a̲̲b̲̲c̲̲d̲̲e̲̲f̲̲g̲̲h̲̲i̲̲j̲̲k̲̲l̲̲m̲̲n̲̲o̲̲p̲̲q̲̲r̲̲s̲̲t̲̲u̲̲v̲̲w̲̲x̲̲y̲̲z̲̲0̲̲1̲̲2̲̲3̲̲4̲̲5̲̲6̲̲7̲̲8̲̲9̲̲
* ''single underline capitalized:'' A̲B̲C̲D̲E̲F̲G̲H̲I̲J̲K̲L̲M̲N̲O̲P̲Q̲R̲S̲T̲U̲V̲W̲X̲Y̲Z̲
* ''double underline capitalized:'' A̲̲B̲̲C̲̲D̲̲E̲̲F̲̲G̲̲H̲̲I̲̲J̲̲K̲̲L̲̲M̲̲N̲̲O̲̲P̲̲Q̲̲R̲̲S̲̲T̲̲U̲̲V̲̲W̲̲X̲̲Y̲̲Z̲̲
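Sequences like those above can be produced by placing U+0332 after each character. The Python sketch below does this; the function names `underline` and `strip_underline` are hypothetical, and the code makes the simplifying assumption that the input contains no combining marks of its own.

```python
import unicodedata

COMBINING_LOW_LINE = "\u0332"


def underline(text):
    """Append U+0332 after every character so the marks join into a line."""
    return "".join(char + COMBINING_LOW_LINE for char in text)


def strip_underline(text):
    """Remove every U+0332, recovering the plain text."""
    return text.replace(COMBINING_LOW_LINE, "")


marked = underline("underline")
print(marked)
print(len("underline"), len(marked))         # 9 18
print(unicodedata.name(COMBINING_LOW_LINE))  # COMBINING LOW LINE
print(unicodedata.name("_"))                 # LOW LINE
```

The doubled length shows that the underline is stored as extra code points in the text itself, unlike markup-based underlining, which leaves the characters untouched. How well the marks join into one line depends on the font and renderer.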
"Simulated" underlines in plain-text
In plain-text applications, including plain-text e-mails, where emphasis markup is not possible, the desired emphasis is often indicated by surrounding words with underscore characters. For example, "You must use an _emulsion_ paint on the ceiling".
Some applications will automatically add emphasis to text manually bracketed by underscores, either by underlining or by italicizing it (e.g. _string_ may render as an underlined "string" or as ''string'').
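One way such automatic rendering might work is sketched below in Python. This is not the algorithm of any particular application: the regular expression, the function name `render_emphasis`, and the default `<u>` tags are all assumptions made for the example. The pattern deliberately ignores underscores inside identifiers such as snake_case_name.

```python
import re

# Match _word_ or _several words_: the underscores must touch the enclosed
# text and must not be adjacent to letters, digits, or other underscores.
EMPHASIS = re.compile(
    r"(?<![A-Za-z0-9_])_(?=\S)([^_]+?)(?<=\S)_(?![A-Za-z0-9_])"
)


def render_emphasis(text, open_tag="<u>", close_tag="</u>"):
    """Replace _enclosed_ spans with the given opening and closing markup."""
    return EMPHASIS.sub(
        lambda match: f"{open_tag}{match.group(1)}{close_tag}", text
    )


sentence = "You must use an _emulsion_ paint on the ceiling"
print(render_emphasis(sentence))
# You must use an <u>emulsion</u> paint on the ceiling
print(render_emphasis("_string_", "''", "''"))
# ''string''
print(render_emphasis("keep snake_case_name as it is"))
# keep snake_case_name as it is
```

Passing different tags switches between the underlined and the italic rendering mentioned above without changing the matching rule.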
As a marker for incorrectness
Underline (typically red or wavy or both) is often used by spell checkers (and grammar checkers) to denote misspelled or otherwise incorrect text.
Manuscripts
Depending on local conventions, the following kinds of underlines may be used inline on manuscripts to indicate the special typefaces to be used:
*single dashed underline for ''stet'', 'let it stand', proof-reading mark cancelled.
*single straight underline for ''italic type''
*single wavy underline for bold type
*double straight underline for small caps
*double underline of one straight line and one wavy line for ''bold italic''
*triple underline for FULL CAPITAL LETTERS (used among small caps or to change text already typed as lower case).
Underlines in Chinese
In Chinese, the underline is a little-used punctuation mark for proper names (simplified Chinese: 专名号; traditional Chinese: 專名號; pinyin: zhuānmínghào; literally "proper name mark", used for personal and geographic names). Its meaning is somewhat akin to capitalization in English, and it should never be used for emphasis, even if the influence of English computing makes the latter sometimes occur. A wavy underline (simplified Chinese: 书名号; traditional Chinese: 書名號; pinyin: shūmínghào; literally "book title mark") serves a similar function, but marks names of literary works instead of proper names.
In the case of two or more adjacent proper names, each individual proper name is separately underlined so there should be a slight gap between the underlining of each proper name.
See also
* Space Character
* Overline
* Strikethrough
* Undertie
* Visible space