HOME

TheInfoList



OR:

In
computing Computing is any goal-oriented activity requiring, benefiting from, or creating computing machinery. It includes the study and experimentation of algorithmic processes, and development of both hardware and software. Computing has scientific, e ...
and
telecommunication Telecommunication is the transmission of information by various types of technologies over wire, radio, optical, or other electromagnetic systems. It has its origin in the desire of humans for communication over a distance greater than that fe ...
, a control
character Character or Characters may refer to: Arts, entertainment, and media Literature * ''Character'' (novel), a 1936 Dutch novel by Ferdinand Bordewijk * ''Characters'' (Theophrastus), a classical Greek set of character sketches attributed to The ...
or non-printing character (NPC) is a
code point In character encoding terminology, a code point, codepoint or code position is a numerical value that maps to a specific character. Code points usually represent a single grapheme—usually a letter, digit, punctuation mark, or whitespace—but ...
(a
number A number is a mathematical object used to count, measure, and label. The original examples are the natural numbers 1, 2, 3, 4, and so forth. Numbers can be represented in language with number words. More universally, individual numbers c ...
) in a
character set Character encoding is the process of assigning numbers to graphical characters, especially the written characters of human language, allowing them to be stored, transmitted, and transformed using digital computers. The numerical values that ...
, that does not represent a written symbol. They are used as
in-band signaling In telecommunications, in-band signaling is the sending of control information within the same band or channel used for data such as voice or video. This is in contrast to out-of-band signaling which is sent over a different channel, or even ov ...
to cause effects other than the addition of a symbol to the text. All other characters are mainly printing, printable, or
graphic character In ISO/IEC 646 (commonly known as ASCII) and related standards including ISO 8859 and Unicode, a graphic character is any character intended to be written, printed, or otherwise displayed in a form that can be read by humans. In other words, it i ...
s, except perhaps for the "
space Space is the boundless three-dimensional extent in which objects and events have relative position and direction. In classical physics, physical space is often conceived in three linear dimensions, although modern physicists usually consider ...
" character (see ASCII printable characters).


History

Procedural signs in
Morse code Morse code is a method used in telecommunication to encode text characters as standardized sequences of two different signal durations, called ''dots'' and ''dashes'', or ''dits'' and ''dahs''. Morse code is named after Samuel Morse, one of ...
are a form of control character. A form of control characters were introduced in the 1870
Baudot code The Baudot code is an early character encoding for telegraphy invented by Émile Baudot in the 1870s. It was the predecessor to the International Telegraph Alphabet No. 2 (ITA2), the most common teleprinter code in use until the advent of ASCII. ...
: NUL and DEL. The 1901
Murray code The Baudot code is an early character encoding for telegraphy invented by Émile Baudot in the 1870s. It was the predecessor to the International Telegraph Alphabet No. 2 (ITA2), the most common teleprinter code in use until the advent of ASCII. ...
added the
carriage return A carriage return, sometimes known as a cartridge return and often shortened to CR, or return, is a control character or mechanism used to reset a device's position to the beginning of a line of text. It is closely associated with the line feed a ...
(CR) and
line feed Newline (frequently called line ending, end of line (EOL), next line (NEL) or line break) is a control character or sequence of control characters in character encoding specifications such as ASCII, EBCDIC, Unicode, etc. This character, or a ...
(LF), and other versions of the Baudot code included other control characters. The
bell character A bell code (sometimes bell character) is a device control code originally sent to ring a small electromechanical bell on tickers and other teleprinters and teletypewriters to alert operators at the other end of the line, often of an incoming me ...
(BEL), which rang a bell to alert operators, was also an early
teletype A teleprinter (teletypewriter, teletype or TTY) is an electromechanical device that can be used to send and receive typed messages through various communications channels, in both point-to-point and point-to-multipoint configurations. Initia ...
control character. Control characters have also been called "format effectors".


In ASCII

There were quite a few control characters defined (33 in ASCII, and the ECMA-48 standard adds 32 more). This was because early terminals had very primitive mechanical or electrical controls that made any kind of state-remembering
API An application programming interface (API) is a way for two or more computer programs to communicate with each other. It is a type of software interface, offering a service to other pieces of software. A document or standard that describes how ...
quite expensive to implement, thus a different code for each and every function looked like a requirement. It quickly became possible and inexpensive to interpret sequences of codes to perform a function, and device makers found a way to send hundreds of device instructions. Specifically, they used ASCII code 27 (escape), followed by a series of characters called a "control sequence" or "escape sequence". The mechanism was invented by
Bob Bemer Robert William Bemer (February 8, 1920 – June 22, 2004) was a computer scientist best known for his work at IBM during the late 1950s and early 1960s. Early life and education Born in Sault Ste. Marie, Michigan, Bemer graduated from Cranb ...
, the father of ASCII. For example, the sequence of code 27, followed by the printable characters " ,_would_cause_a_ ,_would_cause_a_Digital_Equipment_Corporation">;10H",_would_cause_a_Digital_Equipment_Corporation_
,_would_cause_a_Digital_Equipment_Corporation">;10H",_would_cause_a_Digital_Equipment_Corporation_VT100">Digital_Equipment_Corporation.html"_;"title=";10H",_would_cause_a_Digital_Equipment_Corporation">;10H",_would_cause_a_Digital_Equipment_Corporation_VT100_terminal_to_move_its_cursor_(user_interface).html" "title="VT100.html" ;"title="Digital_Equipment_Corporation.html" ;"title=";10H", would cause a Digital Equipment Corporation">;10H", would cause a Digital Equipment Corporation VT100">Digital_Equipment_Corporation.html" ;"title=";10H", would cause a Digital Equipment Corporation">;10H", would cause a Digital Equipment Corporation VT100 terminal to move its cursor (user interface)">cursor Cursor may refer to: * Cursor (user interface), an indicator used to show the current position for user interaction on a computer monitor or other display device * Cursor (databases), a control structure that enables traversal over the records in ...
to the 10th cell of the 2nd line of the screen. Several standards exist for these sequences, notably ANSI X3.64. But the number of non-standard variations in use is large, especially among printers, where technology has advanced far faster than any standards body can possibly keep up with. All entries in the
ASCII ASCII ( ), abbreviated from American Standard Code for Information Interchange, is a character encoding standard for electronic communication. ASCII codes represent text in computers, telecommunications equipment, and other devices. Because of ...
table below code 32 (technically the C0 control code set) are of this kind, including CR and LF used to separate lines of text. The code 127 (
DEL Del, or nabla, is an operator used in mathematics (particularly in vector calculus) as a vector differential operator, usually represented by the nabla symbol ∇. When applied to a function defined on a one-dimensional domain, it denotes ...
) is also a control character.
Extended ASCII Extended ASCII is a repertoire of character encodings that include (most of) the original 96 ASCII character set, plus up to 128 additional characters. There is no formal definition of "extended ASCII", and even use of the term is sometimes critic ...
sets defined by
ISO 8859 ISO/IEC 8859 is a joint ISO and IEC series of standards for 8-bit character encodings. The series of standards consists of numbered parts, such as ISO/IEC 8859-1, ISO/IEC 8859-2, etc. There are 15 parts, excluding the abandoned ISO/IEC 8859-12 ...
added the codes 128 through 159 as control characters, this was primarily done so that if the high bit was stripped it would not change a printing character to a C0 control code, but there have been some assignments here, in particular NEL. This second set is called the C1 set. These 65 control codes were carried over to
Unicode Unicode, formally The Unicode Standard,The formal version reference is is an information technology Technical standard, standard for the consistent character encoding, encoding, representation, and handling of Character (computing), text expre ...
. Unicode added more characters that could be considered controls, but it makes a distinction between these "Formatting characters" (such as the
zero-width non-joiner The zero-width non-joiner (ZWNJ) is a non-printing character used in the computerization of writing systems that make use of ligatures. When placed between two characters that would otherwise be connected into a ligature, a ZWNJ causes them to ...
), and the 65 control characters. The
Extended Binary Coded Decimal Interchange Code Extended Binary Coded Decimal Interchange Code (EBCDIC; ) is an eight-bit character encoding used mainly on IBM mainframe and IBM midrange computer operating systems. It descended from the code used with punched cards and the corresponding six- ...
(EBCDIC) character set contains 65 control codes, including all of the ASCII control codes plus additional codes which are mostly used to control IBM peripherals. The control characters in ASCII still in common use include: * 0 (
null Null may refer to: Science, technology, and mathematics Computing * Null (SQL) (or NULL), a special marker and keyword in SQL indicating that something has no value * Null character, the zero-valued ASCII character, also designated by , often use ...
, NUL, \0, ^@), originally intended to be an ignored character, but now used by many
programming language A programming language is a system of notation for writing computer programs. Most programming languages are text-based formal languages, but they may also be graphical. They are a kind of computer language. The description of a programming ...
s including C to mark the end of a string. * 7 (
bell A bell is a directly struck idiophone percussion instrument. Most bells have the shape of a hollow cup that when struck vibrates in a single strong strike tone, with its sides forming an efficient resonator. The strike may be made by an inter ...
, BEL, \a, ^G), which may cause the device to emit a warning such as a bell or beep sound or the screen flashing. * 8 (
backspace Backspace () is the keyboard key that originally pushed the typewriter carriage one position backwards and in modern computer systems moves the display cursor one position backwards,"Backwards" means to the left for left-to-right languages. delete ...
, BS, \b, ^H), may overprint the previous character. * 9 (
horizontal tab The tab key (abbreviation of tabulator key or tabular key) on a keyboard is used to advance the cursor to the next tab stop. History The word ''tab'' derives from the word ''tabulate'', which means "to arrange data in a tabular, or table, fo ...
, HT, \t, ^I), moves the printing position right to the next tab stop. * 10 (
line feed Newline (frequently called line ending, end of line (EOL), next line (NEL) or line break) is a control character or sequence of control characters in character encoding specifications such as ASCII, EBCDIC, Unicode, etc. This character, or a ...
, LF, \n, ^J), moves the print head down one line, or to the left edge and down. Used as the end of line marker in most UNIX systems and variants. * 11 (
vertical tab The tab key (abbreviation of tabulator key or tabular key) on a keyboard is used to advance the cursor to the next tab stop. History The word ''tab'' derives from the word ''tabulate'', which means "to arrange data in a tabular, or table, fo ...
, VT, \v, ^K), vertical tabulation. * 12 (
form feed A page break is a marker in an electronic document that tells the document interpreter that the content which follows is part of a new page. A page break causes a form feed to be sent to the printer during spooling of the document to the printer. ...
, FF, \f, ^L), to cause a printer to eject paper to the top of the next page, or a video terminal to clear the screen. * 13 (
carriage return A carriage return, sometimes known as a cartridge return and often shortened to CR, or return, is a control character or mechanism used to reset a device's position to the beginning of a line of text. It is closely associated with the line feed a ...
, CR, \r, ^M), moves the printing position to the start of the line, allowing overprinting. Used as the end of line marker in
Classic Mac OS Mac OS (originally System Software; retronym: Classic Mac OS) is the series of operating systems developed for the Macintosh family of personal computers by Apple Computer from 1984 to 2001, starting with System 1 and ending with Mac OS 9. The ...
,
OS-9 OS-9 is a family of real-time, process-based, multitasking, multi-user operating systems, developed in the 1980s, originally by Microware Systems Corporation for the Motorola 6809 microprocessor. It was purchased by Radisys Corp in 2001, and ...
,
FLEX Flex or FLEX may refer to: Computing * Flex (language), developed by Alan Kay * FLEX (operating system), a single-tasking operating system for the Motorola 6800 * FlexOS, an operating system developed by Digital Research * FLEX (protocol), a comm ...
(and variants). A CR+LF pair is used by
CP/M CP/M, originally standing for Control Program/Monitor and later Control Program for Microcomputers, is a mass-market operating system created in 1974 for Intel 8080/ 85-based microcomputers by Gary Kildall of Digital Research, Inc. Initial ...
-80 and its derivatives including
DOS DOS is shorthand for the MS-DOS and IBM PC DOS family of operating systems. DOS may also refer to: Computing * Data over signalling (DoS), multiplexing data onto a signalling channel * Denial-of-service attack (DoS), an attack on a communicat ...
and
Windows Windows is a group of several proprietary graphical operating system families developed and marketed by Microsoft. Each family caters to a certain sector of the computing industry. For example, Windows NT for consumers, Windows Server for serv ...
, and by
Application Layer An application layer is an abstraction layer that specifies the shared communications protocols and Interface (computing), interface methods used by Host (network), hosts in a communications network. An ''application layer'' abstraction is speci ...
protocols Protocol may refer to: Sociology and politics * Protocol (politics), a formal agreement between nation states * Protocol (diplomacy), the etiquette of diplomacy and affairs of state * Etiquette, a code of personal behavior Science and technology ...
such as
FTP The File Transfer Protocol (FTP) is a standard communication protocol used for the transfer of computer files from a server to a client on a computer network. FTP is built on a client–server model architecture using separate control and data ...
,
SMTP The Simple Mail Transfer Protocol (SMTP) is an Internet standard communication protocol for electronic mail transmission. Mail servers and other message transfer agents use SMTP to send and receive mail messages. User-level email clients typical ...
, and
HTTP The Hypertext Transfer Protocol (HTTP) is an application layer protocol in the Internet protocol suite model for distributed, collaborative, hypermedia information systems. HTTP is the foundation of data communication for the World Wide Web, ...
. * 26 (
Control-Z In computer data, a substitute character (␚) is a control character that is used to pad transmitted data in order to send it in blocks of fixed size, or to stand in place of a character that is recognized to be invalid, erroneous or unreprese ...
, SUB, EOF, ^Z). Acts as an end-of-file for the Windows text-mode file i/o. * 27 (
escape Escape or Escaping may refer to: Computing * Escape character, in computing and telecommunication, a character which signifies that what follows takes an alternative interpretation ** Escape sequence, a series of characters used to trigger some so ...
, ESC, \e ( GCC only), ^ /code>). Introduces an escape sequence. Control characters may be described as doing something when the user inputs them, such as code 3 (End-of-Text character, ETX, ^C) to interrupt the running process, or code 4 (End-of-Transmission character, EOT, ^D), used to end text input or to exit a Unix shell. These uses usually have little to do with their use when they are in text being output, and on modern systems usually do not involve the transmission of the code number at all (instead the program gets the fact that the user is holding down the Ctrl key and pushing the key marked with a 'C').


In Unicode

In Unicode, "Control-characters" are U+0000—U+001F (C0 controls), U+007F (delete), and U+0080—U+009F (C1 controls). Their General Category is "Cc". Formatting codes are distinct, in General Category "Cf". The Cc control characters have no Name in Unicode, but are given labels such as "" instead.


Display

There are a number of techniques to display non-printing characters, which may be illustrated with the
bell character A bell code (sometimes bell character) is a device control code originally sent to ring a small electromechanical bell on tickers and other teleprinters and teletypewriters to alert operators at the other end of the line, often of an incoming me ...
in
ASCII ASCII ( ), abbreviated from American Standard Code for Information Interchange, is a character encoding standard for electronic communication. ASCII codes represent text in computers, telecommunications equipment, and other devices. Because of ...
encoding: *
Code point In character encoding terminology, a code point, codepoint or code position is a numerical value that maps to a specific character. Code points usually represent a single grapheme—usually a letter, digit, punctuation mark, or whitespace—but ...
: decimal 7, hexadecimal 0x07 * An abbreviation, often three capital letters: BEL * A special character condensing the abbreviation: Unicode U+2407 (␇), "symbol for bell" * An
ISO 2047 ISO 2047 (Information processing – Graphical representations for the control characters of the 7-bit coded character set) is a standard for graphical representation of the control characters for debugging purposes, such as may be found in the ...
graphical representation: Unicode U+237E (⍾), "graphic for bell" *
Caret notation Caret notation is a notation for control characters in ASCII. The notation assigns to control-code 1, sequentially through the alphabet to assigned to control-code 26 (0x1A). For the control-codes outside of the range 1–26, the ...
in ASCII, where code point 00xxxxx is represented as a caret followed by the capital letter at code point 10xxxxx: ^G * An
escape sequence In computer science, an escape sequence is a combination of characters that has a meaning other than the literal characters contained therein; it is marked by one or more preceding (and possibly terminating) characters. Examples * In C and man ...
, as in C/
C++ C++ (pronounced "C plus plus") is a high-level general-purpose programming language created by Danish computer scientist Bjarne Stroustrup as an extension of the C programming language, or "C with Classes". The language has expanded significan ...
character string codes: , , , etc.


How control characters map to keyboards

ASCII-based
keyboards Keyboard may refer to: Text input * Keyboard, part of a typewriter * Computer keyboard ** Keyboard layout, the software control of computer keyboards and their mapping ** Keyboard technology, computer keyboard hardware and firmware Music * Musi ...
have a key labelled "
Control Control may refer to: Basic meanings Economics and business * Control (management), an element of management * Control, an element of management accounting * Comptroller (or controller), a senior financial officer in an organization * Controllin ...
", "Ctrl", or (rarely) "Cntl" which is used much like a shift key, being pressed in combination with another letter or symbol key. In one implementation, the control key generates the code 64 places below the code for the (generally) uppercase letter it is pressed in combination with (i.e., subtract 64 from ASCII code value in decimal of the (generally) uppercase letter). The other implementation is to take the ASCII code produced by the key and
bitwise AND In computer programming, a bitwise operation operates on a bit string, a bit array or a binary numeral (considered as a bit string) at the level of its individual bits. It is a fast and simple action, basic to the higher-level arithmetic oper ...
it with 31, forcing bits 6 and 7 to zero. For example, pressing "control" and the letter "g" or "G" (code 107 in
octal The octal numeral system, or oct for short, is the radix, base-8 number system, and uses the Numerical digit, digits 0 to 7. This is to say that 10octal represents eight and 100octal represents sixty-four. However, English, like most languages, ...
or 71 in
base 10 The decimal numeral system (also called the base-ten positional numeral system and denary or decanary) is the standard system for denoting integer and non-integer numbers. It is the extension to non-integer numbers of the Hindu–Arabic numer ...
, which is 01000111 in
binary Binary may refer to: Science and technology Mathematics * Binary number, a representation of numbers using only two digits (0 and 1) * Binary function, a function that takes two arguments * Binary operation, a mathematical operation that t ...
), produces the code 7 (Bell, 7 in base 10, or 00000111 in binary). The NULL character (code 0) is represented by Ctrl-@, "@" being the code immediately before "A" in the ASCII character set. For convenience, a lot of terminals accept Ctrl-Space as an alias for Ctrl-@. In either case, this produces one of the 32 ASCII control codes between 0 and 31. This approach is not able to represent the DEL character because of its value (code 127), but Ctrl-? is often used for this character, as subtracting 64 from a '?' gives −1, which if masked to 7 bits is 127. When the control key is held down, letter keys produce the same control characters regardless of the state of the shift or caps lock keys. In other words, it does not matter whether the key would have produced an upper-case or a lower-case letter. The interpretation of the control key with the space, graphics character, and digit keys (ASCII codes 32 to 63) vary between systems. Some will produce the same character code as if the control key were not held down. Other systems translate these keys into control characters when the control key is held down. The interpretation of the control key with non-ASCII ("foreign") keys also varies between systems. Control characters are often rendered into a printable form known as
caret notation Caret notation is a notation for control characters in ASCII. The notation assigns to control-code 1, sequentially through the alphabet to assigned to control-code 26 (0x1A). For the control-codes outside of the range 1–26, the ...
by printing a caret (^) and then the ASCII character that has a value of the control character plus 64. Control characters generated using letter keys are thus displayed with the upper-case form of the letter. For example, ^G represents code 7, which is generated by pressing the G key when the control key is held down. Keyboards also typically have a few single keys which produce control character codes. For example, the key labelled "Backspace" typically produces code 8, "Tab" code 9, "Enter" or "Return" code 13 (though some keyboards might produce code 10 for "Enter"). Many keyboards include keys that do not correspond to any ASCII printable or control character, for example cursor control arrows and
word processing A word is a basic element of language that carries an objective or practical meaning, can be used on its own, and is uninterruptible. Despite the fact that language speakers often have an intuitive grasp of what a word is, there is no consen ...
functions. The associated keypresses are communicated to computer programs by one of four methods: appropriating otherwise unused control characters; using some encoding other than ASCII; using multi-character control sequences; or using an additional mechanism outside of generating characters. "Dumb"
computer terminal A computer terminal is an electronic or electromechanical hardware device that can be used for entering data into, and transcribing data from, a computer or a computing system. The teletype was an example of an early-day hard-copy terminal and ...
s typically use control sequences. Keyboards attached to stand-alone
personal computer A personal computer (PC) is a multi-purpose microcomputer whose size, capabilities, and price make it feasible for individual use. Personal computers are intended to be operated directly by an end user, rather than by a computer expert or tec ...
s made in the 1980s typically use one (or both) of the first two methods. Modern computer keyboards generate
scancode A scancode (or scan code) is the data that most computer keyboards send to a computer to report which keys have been pressed. A number, or sequence of numbers, is assigned to each key on the keyboard. Variants Mapping key positions by row an ...
s that identify the specific physical keys that are pressed; computer software then determines how to handle the keys that are pressed, including any of the four methods described above.


The design purpose

The control characters were designed to fall into a few groups: printing and display control, data structuring, transmission control, and miscellaneous.


Printing and display control

Printing control characters were first used to control the physical mechanism of printers, the earliest output device. An early example of this idea was the use of Figures (FIGS) and Letters (LTRS) in
Baudot code The Baudot code is an early character encoding for telegraphy invented by Émile Baudot in the 1870s. It was the predecessor to the International Telegraph Alphabet No. 2 (ITA2), the most common teleprinter code in use until the advent of ASCII. ...
to shift between two code pages. A later, but still early, example was the
out-of-band Out-of-band activity is activity outside a defined telecommunications frequency band, or, metaphorically, outside of any primary communication channel. Protection from falsing is among its purposes. Examples General usage * Out-of-band agreement ...
ASA carriage control characters ASA control characters are simple printing command characters used to control the movement of paper through line printers. These commands are presented as special characters in the first column of each text line to be printed, and affect how the pa ...
. Later, control characters were integrated into the stream of data to be printed. The carriage return character (CR), when sent to such a device, causes it to put the character at the edge of the paper at which writing begins (it may, or may not, also move the printing position to the next line). The line feed character (LF/NL) causes the device to put the printing position on the next line. It may (or may not), depending on the device and its configuration, also move the printing position to the start of the next line (which would be the leftmost position for
left-to-right A writing system is a method of visually representing verbal communication, based on a script and a set of rules regulating its use. While both writing and speech are useful in conveying messages, writing differs in also being a reliable form ...
scripts, such as the alphabets used for Western languages, and the rightmost position for
right-to-left In a script (commonly shortened to right to left or abbreviated RTL, RL-TB or R2L), writing starts from the right of the page and continues to the left, proceeding from top to bottom for new lines. Arabic, Hebrew, Persian, Pashto, Urdu, Kashmiri ...
scripts such as the Hebrew and Arabic alphabets). The vertical and horizontal tab characters (VT and HT/TAB) cause the output device to move the printing position to the next tab stop in the direction of reading. The form feed character (FF/NP) starts a new sheet of paper, and may or may not move to the start of the first line. The backspace character (BS) moves the printing position one character space backwards. On printers, including hard-copy terminals, this is most often used so the printer can overprint characters to make other, not normally available, characters. On video terminals and other electronic output devices, there are often software (or hardware) configuration choices that allow a destructive backspace (e.g., a BS, SP, BS sequence), which erases, or a non-destructive one, which does not. The shift in and shift out characters (SI and SO) selected alternate character sets, fonts, underlining, or other printing modes. Escape sequences were often used to do the same thing. With the advent of
computer terminal A computer terminal is an electronic or electromechanical hardware device that can be used for entering data into, and transcribing data from, a computer or a computing system. The teletype was an example of an early-day hard-copy terminal and ...
s that did not physically print on paper and so offered more flexibility regarding screen placement, erasure, and so forth, printing control codes were adapted. Form feeds, for example, usually cleared the screen, there being no new paper page to move to. More complex escape sequences were developed to take advantage of the flexibility of the new terminals, and indeed of newer printers. The concept of a control character had always been somewhat limiting, and was extremely so when used with new, much more flexible, hardware. Control sequences (sometimes implemented as escape sequences) could match the new flexibility and power and became the standard method. However, there were, and remain, a large variety of standard sequences to choose from.


Data structuring

The separators (File, Group, Record, and Unit: FS, GS, RS and US) were made to structure data, usually on a tape, in order to simulate
punched card A punched card (also punch card or punched-card) is a piece of stiff paper that holds digital data represented by the presence or absence of holes in predefined positions. Punched cards were once common in data processing applications or to di ...
s. End of medium (EM) warns that the tape (or other recording medium) is ending. While many systems use CR/LF and TAB for structuring data, it is possible to encounter the separator control characters in data that needs to be structured. The separator control characters are not overloaded; there is no general use of them except to separate data into structured groupings. Their numeric values are contiguous with the space character, which can be considered a member of the group, as a word separator. For example, the RS separator is used by (JSON Text Sequences) to encode a sequence of JSON elements. Each sequence item starts with a RS character and ends with a line feed. This allows to serialize open-ended JSON sequences. It is one of the JSON streaming protocols.


Transmission control

The transmission control characters were intended to structure a data stream, and to manage re-transmission or graceful failure, as needed, in the face of transmission errors. The start of heading (SOH) character was to mark a non-data section of a data stream—the part of a stream containing addresses and other housekeeping data. The start of text character (STX) marked the end of the header, and the start of the textual part of a stream. The end of text character (ETX) marked the end of the data of a message. A widely used convention is to make the two characters preceding ETX a checksum or CRC for error-detection purposes. The end of transmission block character (ETB) was used to indicate the end of a block of data, where data was divided into such blocks for transmission purposes. The escape character ( ESC) was intended to "quote" the next character, if it was another control character it would print it instead of performing the control function. It is almost never used for this purpose today. Various printable characters are used as visible "
escape character In computing and telecommunication, an escape character is a character (computing), character that invokes an alternative interpretation on the following characters in a character sequence. An escape character is a particular case of metacharac ...
s", depending on context. The substitute character ( SUB) was intended to request a translation of the next character from a printable character to another value, usually by setting bit 5 to zero. This is handy because some media (such as sheets of paper produced by typewriters) can transmit only printable characters. However, on MS-DOS systems with files opened in text mode, "end of text" or "end of file" is marked by this
Ctrl-Z In computer data, a substitute character (␚) is a control character that is used to pad transmitted data in order to send it in blocks of fixed size, or to stand in place of a character that is recognized to be invalid, erroneous or unreprese ...
character, instead of the Ctrl-C or Ctrl-D, which are common on other operating systems. The cancel character ( CAN) signalled that the previous element should be discarded. The negative acknowledge character (
NAK In data networking, telecommunications, and computer buses, an acknowledgment (ACK) is a signal that is passed between communicating processes, computers, or devices to signify acknowledgment, or receipt of message, as part of a communicatio ...
) is a definite flag for, usually, noting that reception was a problem, and, often, that the current element should be sent again. The acknowledge character ( ACK) is normally used as a flag to indicate no problem detected with current element. When a transmission medium is half duplex (that is, it can transmit in only one direction at a time), there is usually a master station that can transmit at any time, and one or more slave stations that transmit when they have permission. The enquire character ( ENQ) is generally used by a master station to ask a slave station to send its next message. A slave station indicates that it has completed its transmission by sending the end of transmission character ( EOT). The device control codes (DC1 to DC4) were originally generic, to be implemented as necessary by each device. However, a universal need in data transmission is to request the sender to stop transmitting when a receiver is temporarily unable to accept any more data.
Digital Equipment Corporation Digital Equipment Corporation (DEC ), using the trademark Digital, was a major American company in the computer industry from the 1960s to the 1990s. The company was co-founded by Ken Olsen and Harlan Anderson in 1957. Olsen was president unt ...
invented a convention which used 19 (the device control 3 character ( DC3), also known as control-S, or XOFF) to "S"top transmission, and 17 (the device control 1 character ( DC1), a.k.a. control-Q, or
XON Software flow control is a method of flow control used in computer data links, especially RS-232 serial. It uses special codes, transmitted in-band, over the primary communications channel. These codes are generally called XOFF and XON (from ...
) to start transmission. It has become so widely used that most don't realize it is not part of official ASCII. This technique, however implemented, avoids additional wires in the data cable devoted only to transmission management, which saves money. A sensible protocol for the use of such transmission flow control signals must be used, to avoid potential deadlock conditions, however. The data link escape character ( DLE) was intended to be a signal to the other end of a data link that the following character is a control character such as STX or ETX. For example a packet may be structured in the following way ( DLE) ( DLE) .


Miscellaneous codes

Code 7 ( BEL) is intended to cause an audible signal in the receiving terminal. An old RFC, which explains the structure and meaning of the control characters in chapters 4.1 and 5.2 Many of the ASCII control characters were designed for devices of the time that are not often seen today. For example, code 22, "synchronous idle" ( SYN), was originally sent by synchronous modems (which have to send data constantly) when there was no actual data to send. (Modern systems typically use a start bit to announce the beginning of a transmitted word— this is a feature of ''asynchronous'' communication. ''Synchronous'' communication links were more often seen with mainframes, where they were typically run over corporate leased lines to connect a mainframe to another mainframe or perhaps a minicomputer.) Code 0 (ASCII code name NUL) is a special case. In paper tape, it is the case when there are no holes. It is convenient to treat this as a ''fill'' character with no meaning otherwise. Since the position of a NUL character has no holes punched, it can be replaced with any other character at a later time, so it was typically used to reserve space, either for correcting errors or for inserting information that would be available at a later time or in another place. In computing it is often used for padding in fixed length records and more commonly, to mark the end of a string. Code 127 (
DEL Del, or nabla, is an operator used in mathematics (particularly in vector calculus) as a vector differential operator, usually represented by the nabla symbol ∇. When applied to a function defined on a one-dimensional domain, it denotes ...
, a.k.a. "rubout") is likewise a special case. Its 7-bit code is ''all-bits-on'' in binary, which essentially erased a character cell on a
paper tape Five- and eight-hole punched paper tape Paper tape reader on the Harwell computer with a small piece of five-hole tape connected in a circle – creating a physical program loop Punched tape or perforated paper tape is a form of data storage ...
when overpunched. Paper tape was a common storage medium when ASCII was developed, with a computing history dating back to WWII code breaking equipment at Biuro Szyfrów. Paper tape became obsolete in the 1970s, so this clever aspect of ASCII rarely saw any use after that. Some systems (such as the original Apples) converted it to a backspace. But because its code is in the range occupied by other printable characters, and because it had no official assigned glyph, many computer equipment vendors used it as an additional printable character (often an all-black "box" character useful for erasing text by overprinting with ink). Non-erasable
programmable ROM A programmable read-only memory (PROM) is a form of digital memory where the contents can be changed once after manufacture of the device. The data is then permanent and cannot be changed. It is one type of read-only memory (ROM). PROMs are used ...
s are typically implemented as arrays of fusible elements, each representing a
bit The bit is the most basic unit of information in computing and digital communications. The name is a portmanteau of binary digit. The bit represents a logical state with one of two possible values. These values are most commonly represente ...
, which can only be switched one way, usually from one to zero. In such PROMs, the DEL and NUL characters can be used in the same way that they were used on punched tape: one to reserve meaningless fill bytes that can be written later, and the other to convert written bytes to meaningless fill bytes. For PROMs that switch one to zero, the roles of NUL and DEL are reversed; also, DEL will only work with 7-bit characters, which are rarely used today; for 8-bit content, the character code 255, commonly defined as a nonbreaking space character, can be used instead of DEL. Many
file system In computing, file system or filesystem (often abbreviated to fs) is a method and data structure that the operating system uses to control how data is stored and retrieved. Without a file system, data placed in a storage medium would be one larg ...
s do not allow control characters in
filename A filename or file name is a name used to uniquely identify a computer file in a directory structure. Different file systems impose different restrictions on filename lengths. A filename may (depending on the file system) include: * name &ndas ...
s, as they may have reserved functions.


See also

* Arrow keys#HJKL keys HJKL as arrow keys, used on ADM-3A terminal *
C0 and C1 control codes The C0 and C1 control code or control character sets define control codes for use in text by computer systems that use ASCII and derivatives of ASCII. The codes represent additional information about the text, such as the position of a cursor, ...
*
Escape sequence In computer science, an escape sequence is a combination of characters that has a meaning other than the literal characters contained therein; it is marked by one or more preceding (and possibly terminating) characters. Examples * In C and man ...
*
In-band signaling In telecommunications, in-band signaling is the sending of control information within the same band or channel used for data such as voice or video. This is in contrast to out-of-band signaling which is sent over a different channel, or even ov ...
*
Whitespace character In computer programming, whitespace is any character or series of characters that represent horizontal or vertical space in typography. When rendered, a whitespace character does not correspond to a visible mark, but typically does occupy an area ...


Notes and references


External links


ISO IR 1
C0 Set of ISO 646 (PDF) {{Authority control