In telecommunications, a line code is a pattern of voltage, current, or photons used to represent digital data transmitted down a communication channel or written to a storage medium. This repertoire of signals is usually called a constrained code in data storage systems.
Some signals are more prone to error than others as the physics of the communication channel or storage medium constrains the repertoire of signals that can be used reliably.
Common line encodings are unipolar, polar, bipolar, and Manchester code.
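As a rough illustration of these four encodings, the following Python sketch maps a short bit sequence to abstract signal levels. The function names, the use of +1, 0 and -1 as levels, and the Manchester convention chosen (a one is sent as a low-to-high transition) are assumptions made for this example; real systems differ in levels, pulse shapes and conventions.

```python
def unipolar_nrz(bits):
    """Unipolar: 1 -> +1, 0 -> 0 (one level per bit)."""
    return [1 if b else 0 for b in bits]


def polar_nrz(bits):
    """Polar: 1 -> +1, 0 -> -1 (one level per bit)."""
    return [1 if b else -1 for b in bits]


def bipolar_ami(bits):
    """Bipolar (alternate mark inversion): 0 -> 0, successive 1s alternate +1 / -1."""
    out, last = [], -1
    for b in bits:
        if b:
            last = -last          # flip polarity for every mark
            out.append(last)
        else:
            out.append(0)
    return out


def manchester(bits):
    """Manchester: two half-bit levels per bit.

    Assumed convention: 1 -> (-1, +1), 0 -> (+1, -1).
    The opposite convention is also in use.
    """
    out = []
    for b in bits:
        out.extend((-1, 1) if b else (1, -1))
    return out


bits = [1, 0, 1, 1, 0]
for encode in (unipolar_nrz, polar_nrz, bipolar_ami, manchester):
    print(encode.__name__, encode(bits))
```

Note that the Manchester output is twice as long as the input, since each bit occupies two half-bit intervals.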
Transmission and storage
After line coding, the signal is put through a physical communication channel, either a transmission medium or data storage medium (Karl Paulsen, "Coding for Magnetic Storage Mediums", 2007). The most common physical channels are:
* the line-coded signal can directly be put on a transmission line, in the form of variations of the voltage or current (often using differential signaling).
* the line-coded signal (the ''baseband signal'') undergoes further pulse shaping (to reduce its frequency bandwidth) and then is modulated (to shift its frequency) to create an ''RF signal'' that can be sent through free space.
* the line-coded signal can be used to turn on and off a light source in free-space optical communication, most commonly used in an infrared remote control.
* the line-coded signal can be printed on paper to create a bar code.
* the line-coded signal can be converted to magnetized spots on a hard drive or tape drive.
* the line-coded signal can be converted to pits on an optical disc.
Some of the more common binary line codes include:

Each line code has advantages and disadvantages. Line codes are chosen to meet one or more of the following criteria:
* Minimize transmission hardware
* Facilitate synchronization
* Ease error detection and correction
* Achieve a target spectral density
* Eliminate a DC component
Disparity
Most long-distance communication channels cannot reliably transport a DC component. The DC component is also called the ''disparity'', the ''bias'', or the DC coefficient. The disparity of a bit pattern is the difference between the number of one bits and the number of zero bits. The ''running disparity'' is the running total of the disparity of all previously transmitted bits. The simplest possible line code, unipolar, gives too many errors on such systems, because it has an unbounded DC component.
Most line codes eliminate the DC component; such codes are called DC-balanced, zero-DC, or DC-free. There are three ways of eliminating the DC component:
* Use a constant-weight code. Each transmitted code word in a constant-weight code is designed such that every code word that contains some positive or negative levels also contains enough of the opposite levels, such that the average level over each code word is zero. Examples of constant-weight codes include Manchester code and Interleaved 2 of 5.
* Use a paired disparity code. Each code word in a paired disparity code that averages to a negative level is paired with another code word that averages to a positive level. The transmitter keeps track of the running DC buildup, and picks the code word that pushes the DC level back towards zero. The receiver is designed so that either code word of the pair decodes to the same data bits. Examples of paired disparity codes include alternate mark inversion, 8b/10b and 4B3T.
* Use a scrambler. For example, the scrambler specified for 64b/66b encoding.
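The disparity bookkeeping and the paired disparity idea can be sketched in a few lines of Python. The code book `PAIRS` below is a hypothetical toy mapping of one data bit to a three-bit code word, invented for this example and not taken from any standard; the function names are likewise illustrative.

```python
def disparity(bits):
    """Number of one bits minus number of zero bits."""
    ones = sum(bits)
    return ones - (len(bits) - ones)


def running_disparity(bits):
    """Running total of the disparity after each transmitted bit."""
    total, out = 0, []
    for b in bits:
        total += 1 if b else -1
        out.append(total)
    return out


# Hypothetical code book: data bit -> (positive-disparity word, negative-disparity word).
PAIRS = {
    0: ((1, 1, 0), (0, 0, 1)),
    1: ((1, 0, 1), (0, 1, 0)),
}
# Either word of a pair decodes to the same data bit.
DECODE = {word: data for data, pair in PAIRS.items() for word in pair}


def encode_paired(data_bits):
    """Pick, for each data bit, the word that pushes the running disparity towards zero."""
    rd, out = 0, []
    for d in data_bits:
        positive, negative = PAIRS[d]
        word = negative if rd > 0 else positive
        rd += disparity(word)
        out.extend(word)
    return out


def decode_paired(channel_bits):
    """Split the channel bits into 3-bit words and look each one up."""
    return [DECODE[tuple(channel_bits[i:i + 3])]
            for i in range(0, len(channel_bits), 3)]


data = [0, 0, 1, 1, 0, 1]
tx = encode_paired(data)
print(tx)
print(running_disparity(tx))
print(decode_paired(tx) == data)
```

In this toy code the running disparity never drifts far from zero regardless of the data, which is the property a DC-balanced code needs; a sequence of unipolar ones, by contrast, would let it grow without bound.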
Polarity
Bipolar line codes have two polarities, are generally implemented as RZ, and have a radix of three since there are three distinct output levels (negative, positive and zero). One of the principal advantages of this type of code is that it can eliminate any DC component. This is important if the signal must pass through a transformer or a long transmission line.
Unfortunately, several long-distance communication channels have polarity ambiguity. Polarity-insensitive line codes compensate in these channels.
There are three ways of providing unambiguous reception of 0 and 1 bits over such channels:
* Pair each code word with the polarity-inverse of that code word. The receiver is designed so that either code word of the pair decodes to the same data bits. Examples include alternate mark inversion, Differential Manchester encoding, coded mark inversion and Miller encoding.
* Differentially code each symbol relative to the previous symbol. Examples include MLT-3 encoding and NRZI.
* Invert the whole stream when inverted syncwords are detected, perhaps using polarity switching.
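The differential approach can be illustrated with a small NRZI sketch in Python. It assumes the convention that a one toggles the line level and a zero leaves it unchanged (the opposite convention also exists), and that the transmitter sends one reference level before the data; both are assumptions made for this example.

```python
def nrzi_encode(bits, start_level=0):
    """Return line levels: a reference level, then one level per data bit.

    Assumed convention: a 1 toggles the level, a 0 keeps it.
    """
    level = start_level
    out = [level]
    for b in bits:
        if b:
            level ^= 1
        out.append(level)
    return out


def nrzi_decode(levels):
    """A change between consecutive levels decodes as 1, no change as 0."""
    return [1 if cur != prev else 0 for prev, cur in zip(levels, levels[1:])]


def invert(levels):
    """Model a channel that swaps the two polarities."""
    return [1 - lv for lv in levels]


bits = [1, 0, 1, 1, 0, 0, 1]
tx = nrzi_encode(bits)
print(nrzi_decode(tx) == bits)
print(nrzi_decode(invert(tx)) == bits)
```

Because the decoder looks only at whether the level changed, inverting every level on the channel leaves the decoded bits unchanged.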
Run-length limited codes
For reliable clock recovery at the receiver, a run-length limitation may be imposed on the generated channel sequence, i.e., the maximum number of consecutive ones or zeros is bounded to a reasonable number. A clock period is recovered by observing transitions in the received sequence, so that a maximum run length guarantees sufficient transitions to assure clock recovery quality.
RLL codes are defined by four main parameters: ''m'', ''n'', ''d'', ''k''. The first two, ''m''/''n'', refer to the rate of the code, while the remaining two specify the minimal ''d'' and maximal ''k'' number of zeroes between consecutive ones. This is used in both telecommunications and storage systems that move a medium past a fixed recording head.
Specifically, RLL bounds the length of stretches (runs) of repeated bits during which the signal does not change. If the runs are too long, clock recovery is difficult; if they are too short, the high frequencies might be attenuated by the communications channel. By modulating the data, RLL reduces the timing uncertainty in decoding the stored data, which would otherwise lead to the possible erroneous insertion or removal of bits when reading the data back. This mechanism ensures that the boundaries between bits can always be accurately found (preventing bit slip), while efficiently using the media to reliably store the maximal amount of data in a given space.
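The (d, k) constraint can be checked mechanically. The following Python sketch measures the runs of zeros between consecutive ones in a channel sequence and tests them against given bounds. The function names are illustrative, and as a simplification the sketch ignores zeros before the first one and after the last one.

```python
from fractions import Fraction


def zero_runs(bits):
    """Lengths of the runs of zeros lying between consecutive ones."""
    ones = [i for i, b in enumerate(bits) if b]
    return [j - i - 1 for i, j in zip(ones, ones[1:])]


def satisfies_dk(bits, d, k):
    """True if every run of zeros between ones has length in [d, k]."""
    return all(d <= run <= k for run in zero_runs(bits))


def code_rate(m, n):
    """A rate m/n code maps m data bits onto n channel bits."""
    return Fraction(m, n)


channel = [1, 0, 0, 1, 0, 0, 0, 1]
print(zero_runs(channel))
print(satisfies_dk(channel, 2, 7))
print(satisfies_dk(channel, 0, 1))
print(code_rate(1, 2))
```

Here the example sequence has zero runs of length 2 and 3, so it meets a (2,7) constraint but violates a (0,1) constraint.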
Early disk drives used very simple encoding schemes, such as RLL (0,1) FM code, followed by RLL (1,3) MFM code, which were widely used in hard disk drives until the mid-1980s. Run-length limited codes are still used in digital optical discs such as CD, DVD, MD, Hi-MD and Blu-ray, with codes such as EFM and EFMPlus. Higher density RLL (2,7) and RLL (1,7) codes became the de facto standards for hard disks by the early 1990s.
Synchronization
Line coding should make it possible for the receiver to synchronize itself to the phase of the received signal. If the clock recovery is not ideal, then the signal to be decoded will not be sampled at the optimal times. This will increase the probability of error in the received data.
Biphase line codes require at least one transition per bit time. This makes it easier to synchronize the transceivers and detect errors; however, the baud rate is greater than that of NRZ codes.
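The trade-off between transitions and symbol rate can be seen by counting level changes for a long run of identical bits. This Python sketch compares polar NRZ with Manchester code as one example of a biphase code; the level values and the Manchester convention (a one is sent as low-to-high) are assumptions made for illustration.

```python
def nrz(bits):
    """Polar NRZ: one level per bit, 1 -> +1, 0 -> -1."""
    return [1 if b else -1 for b in bits]


def manchester(bits):
    """Manchester: two half-bit levels per bit (assumed: 1 -> -1,+1 and 0 -> +1,-1)."""
    out = []
    for b in bits:
        out.extend((-1, 1) if b else (1, -1))
    return out


def transitions(levels):
    """Number of level changes between consecutive symbols."""
    return sum(1 for a, b in zip(levels, levels[1:]) if a != b)


bits = [1] * 8
print(len(nrz(bits)), transitions(nrz(bits)))
print(len(manchester(bits)), transitions(manchester(bits)))
```

For eight identical bits the NRZ signal contains no transitions at all, while the Manchester signal has at least one per bit at the cost of sending twice as many symbols.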
Other considerations
A line code will typically reflect technical requirements of the transmission medium, such as optical fiber or shielded twisted pair. These requirements are unique for each medium, because each one has different behavior related to interference, distortion, capacitance and attenuation.
Common line codes
* 2B1Q
* 4B3T
* 4B5B
* 6b/8b encoding
* 8b/10b encoding
* 64b/66b encoding
* 128b/130b encoding
* Alternate mark inversion (AMI)
* Coded mark inversion (CMI)
* EFMPlus, used in DVDs
* Eight-to-fourteen modulation (EFM), used in compact discs
* Hamming code
* Hybrid ternary code
* Manchester code and differential Manchester
* Mark and space
* MLT-3 encoding
* Modified AMI codes: B8ZS, B6ZS, B3ZS, HDB3
* Modified frequency modulation, Miller encoding and delay encoding
* Non-return-to-zero (NRZ)
* Non-return-to-zero, inverted (NRZI)
* Pulse-position modulation (PPM)
* Return-to-zero (RZ)
* TC-PAM
Optical line codes
* Alternate-Phase Return-to-Zero (APRZ)
* Carrier-Suppressed Return-to-Zero (CSRZ)
* Three of Six, Fiber Optical (TS-FO)
See also
* Physical layer
* Self-synchronizing code and bit synchronization
External links
* Line Coding Lecture No. 9
* Line Coding in Digital Communication
* CodSim 2.0: Open source simulator for Digital Data Communications Model at the University of Malaga written in HTML