In cryptography, a substitution cipher is a method of encrypting in which units of plaintext are replaced with the ciphertext, in a defined manner, with the help of a key; the "units" may be single letters (the most common), pairs of letters, triplets of letters, mixtures of the above, and so forth. The receiver deciphers the text by performing the inverse substitution process to extract the original message.
Substitution ciphers can be compared with transposition ciphers. In a transposition cipher, the units of the plaintext are rearranged in a different and usually quite complex order, but the units themselves are left unchanged. By contrast, in a substitution cipher, the units of the plaintext are retained in the same sequence in the ciphertext, but the units themselves are altered.
There are a number of different types of substitution cipher. If the cipher operates on single letters, it is termed a simple substitution cipher; a cipher that operates on larger groups of letters is termed polygraphic. A monoalphabetic cipher uses fixed substitution over the entire message, whereas a polyalphabetic cipher uses a number of substitutions at different positions in the message, where a unit from the plaintext is mapped to one of several possibilities in the ciphertext and vice versa.
Simple substitution
Substitution of single letters separately (simple substitution) can be demonstrated by writing out the alphabet in some order to represent the substitution. This is termed a substitution alphabet. The cipher alphabet may be shifted or reversed (creating the Caesar and Atbash ciphers, respectively) or scrambled in a more complex fashion, in which case it is called a ''mixed alphabet'' or ''deranged alphabet''. Traditionally, mixed alphabets may be created by first writing out a keyword, removing repeated letters in it, then writing all the remaining letters in the alphabet in the usual order.
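Using this system, the keyword "zebras" gives us the following alphabets:
Plaintext alphabet: ABCDEFGHIJKLMNOPQRSTUVWXYZ
Ciphertext alphabet: ZEBRASCDFGHIJKLMNOPQTUVWXY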
A message
flee at once. we are discovered!
enciphers to
SIAA ZQ LKBA. VA ZOA RFPBLUAOAR!
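The construction of shifted, reversed and keyword-mixed alphabets can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are made up for this example, and the keyword "zebras" is an assumption, chosen because it reproduces the example ciphertext above.

```python
import string

ALPHABET = string.ascii_uppercase


def caesar_alphabet(shift):
    """Cipher alphabet shifted left by `shift` places (3 gives the classic Caesar)."""
    shift %= 26
    return ALPHABET[shift:] + ALPHABET[:shift]


def atbash_alphabet():
    """Cipher alphabet written in reverse order."""
    return ALPHABET[::-1]


def keyword_alphabet(keyword):
    """Keyword with repeats removed, followed by the unused letters in order."""
    seen = []
    for ch in keyword.upper() + ALPHABET:
        if ch in ALPHABET and ch not in seen:
            seen.append(ch)
    return "".join(seen)


def encipher(plaintext, cipher_alphabet):
    """Replace each letter; spaces and punctuation pass through unchanged."""
    return plaintext.upper().translate(str.maketrans(ALPHABET, cipher_alphabet))


def decipher(ciphertext, cipher_alphabet):
    """Apply the inverse substitution."""
    return ciphertext.upper().translate(str.maketrans(cipher_alphabet, ALPHABET))


mixed = keyword_alphabet("zebras")  # assumed keyword, see the note above
print(mixed)
print(encipher("flee at once. we are discovered!", mixed))
```

Deciphering uses the same alphabet pair with the roles swapped, which is why a single key suffices for both directions.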
Usually the ciphertext is written out in blocks of fixed length, omitting punctuation and spaces; this is done to disguise word boundaries from the plaintext and to help avoid transmission errors. These blocks are called "groups", and sometimes a "group count" (i.e. the number of groups) is given as an additional check. Five-letter groups are often used, dating from when messages used to be transmitted by telegraph:
SIAAZ QLKBA VAZOA RFPBL UAOAR
If the length of the message happens not to be divisible by five, it may be padded at the end with "nulls". These can be any characters that decrypt to obvious nonsense, so that the receiver can easily spot them and discard them.
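As a rough sketch of the grouping step, the following Python function strips everything except letters, pads with nulls and returns the groups together with the group count. The function name and the choice of a fixed "X" as the null are assumptions made for illustration; in practice the nulls would be chosen so that they decrypt to obvious nonsense.

```python
def to_groups(ciphertext, size=5, null="X"):
    """Drop spaces and punctuation, pad with nulls, and split into groups.

    Returns the grouped text and the group count.
    """
    letters = [c for c in ciphertext.upper() if "A" <= c <= "Z"]
    while len(letters) % size:
        letters.append(null)  # hypothetical fixed null character
    groups = ["".join(letters[i:i + size]) for i in range(0, len(letters), size)]
    return " ".join(groups), len(groups)


print(to_groups("SIAA ZQ LKBA. VA ZOA RFPBLUAOAR!"))
```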
The ciphertext alphabet is sometimes different from the plaintext alphabet; for example, in the pigpen cipher, the ciphertext consists of a set of symbols derived from a grid.
Such features make little difference to the security of a scheme, however – at the very least, any set of strange symbols can be transcribed back into an A-Z alphabet and dealt with as normal.
In lists and catalogues for salespeople, a very simple encryption is sometimes used to replace numeric digits by letters.
Example: MAT would be used to represent 120.
Security for simple substitution ciphers
Although the traditional keyword method for creating a mixed substitution alphabet is simple, a serious disadvantage is that the last letters of the alphabet (which are mostly low frequency) tend to stay at the end. A stronger way of constructing a mixed alphabet is to generate the substitution alphabet completely randomly.
Although the number of possible substitution alphabets is very large (26! ≈ 2^88.4, or about 88 bits), this cipher is not very strong, and is easily broken. Provided the message is of reasonable length (see below), the cryptanalyst can deduce the probable meaning of the most common symbols by analyzing the frequency distribution of the ciphertext. This allows formation of partial words, which can be tentatively filled in, progressively expanding the (partial) solution (see frequency analysis for a demonstration of this). In some cases, underlying words can also be determined from the pattern of their letters; for example, ''attract'', ''osseous'', and words with those two as the root are the only common English words with the pattern ''ABBCADB''. Many people solve such ciphers for recreation, as with cryptogram puzzles in the newspaper.
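The two tools mentioned above, letter frequency counts and word patterns, are easy to sketch in Python. The function names are illustrative only, and the sample ciphertext is the short example from earlier in this article, which is far too short for reliable frequency analysis.

```python
from collections import Counter


def letter_frequencies(text):
    """Count the letters A-Z in the text, most common first."""
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    return Counter(letters).most_common()


def word_pattern(word):
    """Label each new letter A, B, C, ... in order of first appearance."""
    labels = {}
    pattern = []
    for ch in word.upper():
        if ch not in labels:
            labels[ch] = chr(ord("A") + len(labels))
        pattern.append(labels[ch])
    return "".join(pattern)


print(letter_frequencies("SIAA ZQ LKBA. VA ZOA RFPBLUAOAR!")[:3])
print(word_pattern("attract"), word_pattern("osseous"))
```

A solver would compare the most frequent ciphertext symbols against the most frequent letters of the language, and match ciphertext word patterns against a dictionary indexed by pattern.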
According to the unicity distance of English, 27.6 letters of ciphertext are required to crack a mixed alphabet simple substitution. In practice, typically about 50 letters are needed, although some messages can be broken with fewer if unusual patterns are found. In other cases, the plaintext can be contrived to have a nearly flat frequency distribution, and much longer plaintexts will then be required by the cryptanalyst.
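The key-space and unicity figures quoted in this section can be checked with a short calculation. The sketch below assumes the usual definition of unicity distance (key entropy divided by the redundancy of the language) and an English redundancy of about 3.2 bits per letter; that redundancy value is an assumption of this example, not a figure given in the text.

```python
import math


def key_space_bits(alphabet_size=26):
    """Bits needed to identify one of the n! possible substitution alphabets."""
    return math.log2(math.factorial(alphabet_size))


def unicity_distance(key_bits, redundancy_bits_per_letter):
    """Approximate ciphertext length at which only one key remains plausible."""
    return key_bits / redundancy_bits_per_letter


bits = key_space_bits()
print(round(bits, 1))                         # size of the key space in bits
print(round(unicity_distance(bits, 3.2), 1))  # 3.2 bits/letter is an assumed value
```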
Nomenclator
One once-common variant of the substitution cipher is the nomenclator. Named after the public official who announced the titles of visiting dignitaries, this cipher uses a small code sheet containing letter, syllable and word substitution tables, sometimes homophonic, that typically converted symbols into numbers. Originally the code portion was restricted to the names of important people, hence the name of the cipher; in later years, it covered many common words and place names as well. The symbols for whole words (''codewords'' in modern parlance) and letters (''cipher'' in modern parlance) were not distinguished in the ciphertext. The Rossignols' Great Cipher used by Louis XIV of France was one.
Nomenclators were the standard fare of diplomatic correspondence, espionage, and advanced political conspiracy from the early fifteenth century to the late eighteenth century; most conspirators were and have remained less cryptographically sophisticated. Although government intelligence cryptanalysts were systematically breaking nomenclators by the mid-sixteenth century, and superior systems had been available since 1467, the usual response to cryptanalysis was simply to make the tables larger. By the late eighteenth century, when the system was beginning to die out, some nomenclators had 50,000 symbols.
Nevertheless, not all nomenclators were broken; today, cryptanalysis of archived ciphertexts remains a fruitful area of historical research.
Homophonic substitution
An early attempt to increase the difficulty of frequency analysis attacks on substitution ciphers was to disguise plaintext letter frequencies by homophony. In these ciphers, plaintext letters map to more than one ciphertext symbol. Usually, the highest-frequency plaintext symbols are given more equivalents than lower frequency letters. In this way, the frequency distribution is flattened, making analysis more difficult.
Since more than 26 characters will be required in the ciphertext alphabet, various solutions are employed to invent larger alphabets. Perhaps the simplest is to use a numeric substitution 'alphabet'. Another method consists of simple variations on the existing alphabet; uppercase, lowercase, upside down, etc. More artistically, though not necessarily more securely, some homophonic ciphers employed wholly invented alphabets of fanciful symbols.
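A minimal sketch of a numeric homophonic cipher follows. It assumes rough English letter frequencies (approximate percentages, included only for illustration) and gives each letter a number of three-digit homophones roughly proportional to its frequency. All names here are hypothetical.

```python
import random

# Approximate English letter frequencies in percent (illustrative values only).
FREQ = {
    "E": 12.7, "T": 9.1, "A": 8.2, "O": 7.5, "I": 7.0, "N": 6.7, "S": 6.3,
    "H": 6.1, "R": 6.0, "D": 4.3, "L": 4.0, "C": 2.8, "U": 2.8, "M": 2.4,
    "W": 2.4, "F": 2.2, "G": 2.0, "Y": 2.0, "P": 1.9, "B": 1.5, "V": 1.0,
    "K": 0.8, "J": 0.15, "X": 0.15, "Q": 0.1, "Z": 0.07,
}


def build_homophones(freq, rng):
    """Give each letter at least one code, more for the frequent letters."""
    counts = {ch: max(1, round(f)) for ch, f in freq.items()}
    codes = list(range(100, 100 + sum(counts.values())))
    rng.shuffle(codes)
    return {ch: [codes.pop() for _ in range(n)] for ch, n in counts.items()}


def encipher(plaintext, table, rng):
    """Pick one of the letter's homophones at random for every letter."""
    return [rng.choice(table[c]) for c in plaintext.upper() if c in table]


def decipher(codes, table):
    """Every code belongs to exactly one letter, so decryption is a lookup."""
    reverse = {code: ch for ch, group in table.items() for code in group}
    return "".join(reverse[c] for c in codes)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
table = build_homophones(FREQ, rng)
message = encipher("flee at once", table, rng)
print(message)
print(decipher(message, table))
```

Because a frequent letter such as E is spread over many codes, each individual code appears about as often as the codes for rare letters, which is the flattening effect described above.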
The book cipher is a type of homophonic cipher, one example being the Beale ciphers. This is a story of buried treasure that was described in 1819–21 by use of a ciphered text that was keyed to the Declaration of Independence. Here each ciphertext character was represented by a number. The number was determined by taking the plaintext character and finding a word in the Declaration of Independence that started with that character and using the numerical position of that word in the Declaration of Independence as the encrypted form of that letter. Since many words in the Declaration of Independence start with the same letter, the encryption of that character could be any of the numbers associated with the words in the Declaration of Independence that start with that letter. Deciphering the encrypted text character ''X'' (which is a number) is as simple as looking up the Xth word of the Declaration of Independence and using the first letter of that word as the decrypted character.
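The word-numbering scheme just described can be sketched as follows. For brevity the key text here is only the opening words of the Declaration of Independence, so many letters have no homophone at all; the function names are illustrative, and the sketch makes no claim about how the Beale texts were actually prepared.

```python
import random

# Opening words of the Declaration of Independence, used as a short key text.
KEY_TEXT = ("When in the Course of human events, it becomes necessary for one "
            "people to dissolve the political bands which have connected them "
            "with another")


def index_key_text(key_text):
    """Map each initial letter to the 1-based positions of words starting with it."""
    index = {}
    for position, word in enumerate(key_text.split(), start=1):
        index.setdefault(word[0].upper(), []).append(position)
    return index


def encipher(plaintext, key_text, rng):
    """Replace each letter by the position of a random word with that initial."""
    index = index_key_text(key_text)
    # Raises KeyError if the key text has no word starting with some letter.
    return [rng.choice(index[c]) for c in plaintext.upper() if c.isalpha()]


def decipher(numbers, key_text):
    """Look up the n-th word and keep its first letter."""
    words = key_text.split()
    return "".join(words[n - 1][0].upper() for n in numbers)


print(index_key_text(KEY_TEXT)["T"])
print(decipher([4, 5, 15, 7], KEY_TEXT))
print(encipher("two", KEY_TEXT, random.Random(1)))
```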
Another homophonic cipher was described by Stahl and was one of the first attempts to provide for computer security of data systems in computers through encryption. Stahl constructed the cipher in such a way that the number of homophones for a given character was in proportion to the frequency of the character, thus making frequency analysis much more difficult.
Francesco I Gonzaga, Duke of Mantua, used the earliest known example of a homophonic substitution cipher in 1401 for correspondence with one Simone de Crema.
Polyalphabetic substitution
The work of Al-Qalqashandi (1355–1418), based on the earlier work of Ibn al-Durayhim (1312–1359), contained the first published discussion of substitution and transposition ciphers, as well as the first description of a polyalphabetic cipher, in which each plaintext letter is assigned more than one substitute. Polyalphabetic substitution ciphers were later described in 1467 by Leone Battista Alberti in the form of disks. Johannes Trithemius, in his book ''Steganographia'' (Ancient Greek for "hidden writing") introduced the now more standard form of a ''tableau'' (see below; ca. 1500 but not published until much later). A more sophisticated version using mixed alphabets was described in 1563 by Giovanni Battista della Porta in his book, ''De Furtivis Literarum Notis'' (Latin for "On concealed characters in writing").
In a polyalphabetic cipher, multiple cipher alphabets are used. To facilitate encryption, all the alphabets are usually written out in a large table, traditionally called a ''tableau''. The tableau is usually 26×26, so that 26 full ciphertext alphabets are available. The method of filling the tableau, and of choosing which alphabet to use next, defines the particular polyalphabetic cipher. All such ciphers are easier to break than once believed, as substitution alphabets are repeated for sufficiently large plaintexts.
One of the most popular was that of Blaise de Vigenère. First published in 1585, it was considered unbreakable until 1863, and indeed was commonly called ''le chiffre indéchiffrable'' (French for "indecipherable cipher").
In the Vigenère cipher, the first row of the tableau is filled out with a copy of the plaintext alphabet, and successive rows are simply shifted one place to the left. (Such a simple tableau is called a ''tabula recta'', and mathematically corresponds to adding the plaintext and key letters, modulo 26.) A keyword is then used to choose which ciphertext alphabet to use. Each letter of the keyword is used in turn, and then they are repeated again from the beginning. So if the keyword is 'CAT', the first letter of plaintext is enciphered under alphabet 'C', the second under 'A', the third under 'T', the fourth under 'C' again, and so on. In practice, Vigenère keys were often phrases several words long.
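The addition modulo 26 described above translates almost directly into code. The following is a minimal sketch with illustrative function names; it assumes a plain ''tabula recta'' and passes non-letters through unchanged without consuming a key letter.

```python
A = ord("A")


def tabula_recta():
    """The 26 x 26 tableau: each row is the previous row shifted one place left."""
    alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    return [alphabet[i:] + alphabet[:i] for i in range(26)]


def vigenere(text, keyword, decrypt=False):
    """Add (or subtract, when decrypting) the repeating key letters modulo 26."""
    key = [ord(k) - A for k in keyword.upper() if "A" <= k <= "Z"]
    sign = -1 if decrypt else 1
    out = []
    used = 0  # number of key letters consumed so far
    for ch in text.upper():
        if "A" <= ch <= "Z":
            shift = sign * key[used % len(key)]
            out.append(chr((ord(ch) - A + shift) % 26 + A))
            used += 1
        else:
            out.append(ch)
    return "".join(out)


print(vigenere("THE", "CAT"))
print(vigenere("ATTACKATDAWN", "LEMON"))
print(vigenere("LXFOPVEFRNHR", "LEMON", decrypt=True))
```

Looking up row 'C' of `tabula_recta()` and reading off the column for the plaintext letter gives the same result as the arithmetic in `vigenere`.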
In 1863, Friedrich Kasiski published a method (probably discovered secretly and independently before the Crimean War by Charles Babbage) which enabled the calculation of the length of the keyword in a Vigenère ciphered message. Once this was done, ciphertext letters that had been enciphered under the same alphabet could be picked out and attacked separately as a number of semi-independent simple substitutions. This attack is complicated by the fact that within one alphabet letters were separated and did not form complete words, but simplified by the fact that usually a ''tabula recta'' had been employed.
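A simplified sketch of the idea behind this kind of key-length estimate: repeated ciphertext fragments tend to lie a multiple of the key length apart, so the distances between repeats hint at the key length, after which the ciphertext can be split into one stream per key letter. The function names are hypothetical, and taking a plain greatest common divisor is a simplification; a real analysis has to tolerate coincidental repeats.

```python
from collections import defaultdict
from functools import reduce
from math import gcd


def repeat_distances(ciphertext, n=3):
    """Distances between successive occurrences of every repeated n-gram."""
    letters = "".join(c for c in ciphertext.upper() if "A" <= c <= "Z")
    positions = defaultdict(list)
    for i in range(len(letters) - n + 1):
        positions[letters[i:i + n]].append(i)
    distances = []
    for starts in positions.values():
        distances.extend(b - a for a, b in zip(starts, starts[1:]))
    return sorted(distances)


def candidate_key_length(distances):
    """Greatest common divisor of the distances (None if nothing repeats)."""
    return reduce(gcd, distances) if distances else None


def split_by_key_position(ciphertext, key_length):
    """Collect the letters enciphered under the same alphabet into one stream."""
    letters = "".join(c for c in ciphertext.upper() if "A" <= c <= "Z")
    return [letters[i::key_length] for i in range(key_length)]


# "THEXXXTHE" enciphered with the keyword CAT gives "VHXZXQVHX".
print(repeat_distances("VHXZXQVHX"))
print(split_by_key_position("VHXZXQVHX", 3))
```

Each stream returned by `split_by_key_position` is a simple substitution (a Caesar shift when a ''tabula recta'' was used) and can be attacked with frequency counts.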
As such, even today a Vigenère type cipher should theoretically be difficult to break if mixed alphabets are used in the tableau, if the keyword is random, and if the total length of ciphertext is less than 27.67 times the length of the keyword. These requirements are rarely understood in practice, and so Vigenère enciphered message security is usually less than might have been.
Other notable polyalphabetics include:
* The Gronsfeld cipher. This is identical to the Vigenère except that only 10 alphabets are used, and so the "keyword" is numerical.
* The Beaufort cipher. This is practically the same as the Vigenère, except the ''tabula recta'' is replaced by a backwards one, mathematically equivalent to ciphertext = key - plaintext. This operation is ''self-inverse'', whereby the same table is used for both encryption and decryption.
* The autokey cipher, which mixes plaintext with a key to avoid periodicity.
* The running key cipher, where the key is made very long by using a passage from a book or similar text.
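Three of the variants listed above differ from the Vigenère only in how the key stream is formed or combined, which the following sketch tries to make concrete. The function names are illustrative. The autokey variant shown (a short primer followed by the plaintext itself) is one common construction and is an assumption here, since the text does not specify one.

```python
A = ord("A")


def _nums(text):
    return [ord(c) - A for c in text.upper() if "A" <= c <= "Z"]


def _text(nums):
    return "".join(chr(n % 26 + A) for n in nums)


def gronsfeld(text, digits, decrypt=False):
    """Vigenere restricted to the ten shifts 0-9, keyed by a digit string."""
    key = [int(d) for d in digits]
    sign = -1 if decrypt else 1
    return _text(p + sign * key[i % len(key)] for i, p in enumerate(_nums(text)))


def beaufort(text, keyword):
    """ciphertext = key - plaintext (mod 26); the same call also decrypts."""
    key = _nums(keyword)
    return _text(key[i % len(key)] - p for i, p in enumerate(_nums(text)))


def autokey_encrypt(plaintext, primer):
    """Key stream = primer followed by the plaintext itself (assumed variant)."""
    p = _nums(plaintext)
    key = (_nums(primer) + p)[:len(p)]
    return _text(a + b for a, b in zip(p, key))


def autokey_decrypt(ciphertext, primer):
    """Each recovered plaintext letter extends the key stream."""
    key = _nums(primer)
    out = []
    for i, c in enumerate(_nums(ciphertext)):
        p = (c - key[i]) % 26
        out.append(p)
        key.append(p)
    return _text(out)


print(gronsfeld("ABC", "123"))
print(beaufort(beaufort("DEFENDTHEEASTWALL", "FORTIFICATION"), "FORTIFICATION"))
print(autokey_encrypt("ATTACK", "Q"))
```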
Modern stream ciphers can also be seen, from a sufficiently abstract perspective, to be a form of polyalphabetic cipher in which all the effort has gone into making the keystream as long and unpredictable as possible.
Polygraphic substitution
In a polygraphic substitution cipher, plaintext letters are substituted in larger groups, instead of substituting letters individually. The first advantage is that the frequency distribution is much flatter than that of individual letters (though not actually flat in real languages; for example, 'TH' is much more common than 'XQ' in English). Second, the larger number of symbols requires correspondingly more ciphertext to productively analyze letter frequencies.
To substitute ''pairs'' of letters would take a substitution alphabet 676 symbols long (26^2 = 676). In the same ''De Furtivis Literarum Notis'' mentioned above, della Porta actually proposed such a system, with a 20 x 20 tableau (for the 20 letters of the Italian/Latin alphabet he was using) filled with 400 unique glyphs. However, the system was impractical and probably never actually used.
The earliest practical digraphic cipher (pairwise substitution) was the so-called Playfair cipher, invented by Sir Charles Wheatstone in 1854. In this cipher, a 5 x 5 grid is filled with the letters of a mixed alphabet (two letters, usually I and J, are combined). A digraphic substitution is then simulated by taking pairs of letters as two corners of a rectangle, and using the other two corners as the ciphertext (see the Playfair cipher main article for a diagram). Special rules handle double letters and pairs falling in the same row or column. Playfair was in military use from the Boer War through World War II.
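The rectangle rule can be sketched as below. The text above does not spell out the special rules, so this sketch assumes the conventional ones: a doubled letter is split with a filler X, letters in the same row are replaced by their right-hand neighbours, and letters in the same column by the letters beneath them (wrapping around in both cases). The keyword and message are sample values, and the function names are illustrative.

```python
ALPHABET_25 = "ABCDEFGHIKLMNOPQRSTUVWXYZ"  # J is merged into I


def build_grid(keyword):
    """Return the 5 x 5 grid as a flat list of 25 letters, row by row."""
    grid = []
    for ch in keyword.upper().replace("J", "I") + ALPHABET_25:
        if ch in ALPHABET_25 and ch not in grid:
            grid.append(ch)
    return grid


def to_digraphs(text, filler="X"):
    """Split into pairs, breaking doubled letters and padding with a filler."""
    letters = [c for c in text.upper().replace("J", "I") if c in ALPHABET_25]
    pairs, i = [], 0
    while i < len(letters):
        a = letters[i]
        b = letters[i + 1] if i + 1 < len(letters) else filler
        if a == b:  # doubled letter: pair the first with the filler
            pairs.append((a, filler))
            i += 1
        else:
            pairs.append((a, b))
            i += 2
    return pairs


def _crypt_pair(grid, a, b, shift):
    ra, ca = divmod(grid.index(a), 5)
    rb, cb = divmod(grid.index(b), 5)
    if ra == rb:  # same row: move along the row
        return grid[ra * 5 + (ca + shift) % 5] + grid[rb * 5 + (cb + shift) % 5]
    if ca == cb:  # same column: move along the column
        return grid[((ra + shift) % 5) * 5 + ca] + grid[((rb + shift) % 5) * 5 + cb]
    return grid[ra * 5 + cb] + grid[rb * 5 + ca]  # rectangle: swap the columns


def playfair_encrypt(plaintext, keyword):
    grid = build_grid(keyword)
    return "".join(_crypt_pair(grid, a, b, 1) for a, b in to_digraphs(plaintext))


def playfair_decrypt(ciphertext, keyword):
    grid = build_grid(keyword)
    letters = [c for c in ciphertext.upper() if c in ALPHABET_25]
    return "".join(_crypt_pair(grid, letters[i], letters[i + 1], -1)
                   for i in range(0, len(letters), 2))


secret = playfair_encrypt("hide the gold in the tree stump", "playfair example")
print(secret)
print(playfair_decrypt(secret, "playfair example"))
```

Decryption returns the prepared digraphs, so the filler letters and the I/J merge remain visible in the recovered text. A plaintext that itself contains a doubled X is not handled specially in this sketch.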
Several other practical polygraphics were introduced in 1901 by Felix Delastelle, including the bifid and four-square ciphers (both digraphic) and the trifid cipher (probably the first practical trigraphic).
The Hill cipher, invented in 1929 by Lester S. Hill, is a polygraphic substitution which can combine much larger groups of letters simultaneously using linear algebra. Each letter is treated as a digit in base 26: A = 0, B = 1, and so on. (In a variation, 3 extra symbols are added to make the basis prime.) A block of n letters is then considered as a vector of n dimensions, and multiplied by an n x n matrix, modulo 26. The components of the matrix are the key, and should be random provided that the matrix is invertible modulo 26 (to ensure decryption is possible). A mechanical version of the Hill cipher of dimension 6 was patented in 1929.
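A small sketch of the matrix arithmetic, restricted to 2 x 2 keys so that the inverse can be written out by hand. The key matrix below is a sample value chosen for illustration, the function names are hypothetical, and `pow(det, -1, 26)` requires Python 3.8 or later.

```python
from math import gcd

A = ord("A")


def mat_vec(matrix, vec, mod=26):
    """Multiply a square matrix by a column vector, modulo `mod`."""
    return [sum(m * v for m, v in zip(row, vec)) % mod for row in matrix]


def inverse_2x2(m, mod=26):
    """Inverse of a 2 x 2 matrix modulo `mod`, if the determinant allows it."""
    (a, b), (c, d) = m
    det = (a * d - b * c) % mod
    if gcd(det, mod) != 1:
        raise ValueError("matrix is not invertible modulo %d" % mod)
    det_inv = pow(det, -1, mod)
    return [[(d * det_inv) % mod, (-b * det_inv) % mod],
            [(-c * det_inv) % mod, (a * det_inv) % mod]]


def hill(text, matrix):
    """Encipher blocks of n letters with an n x n matrix (A = 0, B = 1, ...)."""
    n = len(matrix)
    nums = [ord(c) - A for c in text.upper() if "A" <= c <= "Z"]
    if len(nums) % n:
        raise ValueError("message length must be a multiple of the block size")
    out = []
    for i in range(0, len(nums), n):
        out.extend(mat_vec(matrix, nums[i:i + n]))
    return "".join(chr(x + A) for x in out)


KEY = [[3, 3], [2, 5]]  # sample key; its determinant 9 is coprime to 26
print(hill("HELP", KEY))
print(hill(hill("HELP", KEY), inverse_2x2(KEY)))
```

Decryption is the same `hill` call with the inverse matrix, which is why the key must be invertible modulo 26.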
The Hill cipher is vulnerable to a known-plaintext attack because it is completely linear, so it must be combined with some non-linear step to defeat this attack. The combination of wider and wider weak, linear diffusive steps like a Hill cipher, with non-linear substitution steps, ultimately leads to a substitution–permutation network (e.g. a Feistel cipher), so it is possible, from this extreme perspective, to consider modern block ciphers as a type of polygraphic substitution.
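To illustrate why linearity is fatal, the sketch below recovers a 2 x 2 Hill key from two known plaintext blocks and their ciphertext. Writing the plaintext blocks as the columns of a matrix P and the ciphertext blocks as the columns of C, the relation C = K·P (mod 26) gives K = C·P⁻¹ whenever P is invertible modulo 26. The sample pair HELP/HIAT and all function names are assumptions of this example.

```python
def inverse_2x2(m, mod=26):
    """Inverse of a 2 x 2 matrix modulo `mod`.

    pow() raises ValueError when the determinant has no modular inverse.
    """
    (a, b), (c, d) = m
    det_inv = pow((a * d - b * c) % mod, -1, mod)
    return [[(d * det_inv) % mod, (-b * det_inv) % mod],
            [(-c * det_inv) % mod, (a * det_inv) % mod]]


def mat_mul(x, y, mod=26):
    """Product of two 2 x 2 matrices modulo `mod`."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) % mod for j in range(2)]
            for i in range(2)]


def blocks_as_columns(four_letters):
    """Turn four letters into a matrix whose columns are the two 2-letter blocks."""
    n = [ord(c) - ord("A") for c in four_letters.upper()]
    return [[n[0], n[2]], [n[1], n[3]]]


def recover_key(known_plaintext, ciphertext):
    """Solve C = K * P (mod 26) for K using two known blocks."""
    p = blocks_as_columns(known_plaintext)
    c = blocks_as_columns(ciphertext)
    return mat_mul(c, inverse_2x2(p))


print(recover_key("HELP", "HIAT"))
```

For an n x n key the same approach needs n linearly independent known blocks, which is a very small amount of known plaintext.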
Mechanical substitution ciphers
Between around World War I and the widespread availability of computers (for some governments this was approximately the 1950s or 1960s; for other organizations it was a decade or more later; for individuals it was no earlier than 1975), mechanical implementations of polyalphabetic substitution ciphers were widely used. Several inventors had similar ideas about the same time, and rotor cipher machines were patented four times in 1919. The most important of the resulting machines was the Enigma, especially in the versions used by the German military from approximately 1930. The Allies also developed and used rotor machines (e.g., SIGABA and Typex).
All of these were similar in that the substituted letter was chosen electrically from amongst the huge number of possible combinations resulting from the rotation of several letter disks. Since one or more of the disks rotated mechanically with each plaintext letter enciphered, the number of alphabets used was astronomical. Early versions of these machines were, nevertheless, breakable.
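The stepping idea can be illustrated with a toy model: each rotor is a fixed permutation of the alphabet, the signal passes through the rotors in turn, and the rotors advance like an odometer after every letter, so that a different combined alphabet is used each time. This is a deliberately simplified sketch, not a model of any historical machine: it has no reflector or plugboard, the two wirings are just sample permutations, and the function names are made up.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

# Two sample rotor wirings (any permutations of the alphabet would do).
ROTORS = ["EKMFLGDQVZNTOWYHXUSPAIBRCJ", "AJDKSIRUXBLHWTMCQGZNPYFVOE"]


def _step(offsets):
    """Advance the first rotor; carry to the next rotor on a full turn."""
    for i in range(len(offsets)):
        offsets[i] = (offsets[i] + 1) % 26
        if offsets[i] != 0:
            break


def rotor_crypt(text, rotors=ROTORS, decrypt=False):
    """Encipher or decipher with all rotors starting at position 0."""
    wirings = [[ALPHABET.index(c) for c in r] for r in rotors]
    inverses = [[w.index(i) for i in range(26)] for w in wirings]
    offsets = [0] * len(rotors)
    out = []
    for ch in text.upper():
        if ch not in ALPHABET:
            continue
        x = ALPHABET.index(ch)
        if decrypt:
            # Undo the rotors in reverse order using the inverse wirings.
            for w, off in zip(reversed(inverses), reversed(offsets)):
                x = (w[(x + off) % 26] - off) % 26
        else:
            for w, off in zip(wirings, offsets):
                x = (w[(x + off) % 26] - off) % 26
        out.append(ALPHABET[x])
        _step(offsets)
    return "".join(out)


print(rotor_crypt("AA"))
print(rotor_crypt(rotor_crypt("ATTACKATDAWN"), decrypt=True))
```

With two rotors the sequence of alphabets repeats only after 26 × 26 = 676 letters, and every additional rotor multiplies that period by 26.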
William F. Friedman of the US Army's
SIS early found vulnerabilities in
Hebern's rotor machine, and
GC&CS
Government Communications Headquarters, commonly known as GCHQ, is an intelligence and security organisation responsible for providing signals intelligence (SIGINT) and information assurance (IA) to the government and armed forces of the Unit ...
's
Dillwyn Knox
Alfred Dillwyn "Dilly" Knox, CMG (23 July 1884 – 27 February 1943) was a British classics scholar and papyrologist at King's College, Cambridge and a codebreaker. As a member of the Room 40 codebreaking unit he helped decrypt the Zimme ...
solved versions of the Enigma machine (those without the "plugboard") well before
WWII
began. Traffic protected by essentially all of the German military Enigmas was broken by Allied cryptanalysts, most notably those at
Bletchley Park
, beginning with the German Army variant used in the early 1930s. This version was broken by inspired mathematical insight by
Marian Rejewski
in
Poland
.
As far as is publicly known, no messages protected by the
SIGABA
and
Typex machines were ever broken during or near the time when these systems were in service.
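The way a rotating disk changes the substitution alphabet with every letter can be sketched with a single rotor. This is a simplified illustration, not a model of any particular machine: the wiring string is just one permutation of the alphabet chosen for the example, the function names are hypothetical, and only the letters A to Z are handled.

```python
import string

ALPHABET = string.ascii_uppercase
# Illustrative wiring (an assumption): any permutation of the alphabet works.
WIRING = "EKMFLGDQVZNTOWYHXUSPAIBRCJ"


def rotor_encrypt(plaintext, start=0):
    """Encipher A-Z text with one rotor that steps after every letter."""
    out = []
    offset = start
    for ch in plaintext:
        p = ALPHABET.index(ch)
        # Enter the rotor at the shifted position, then undo the shift on exit.
        c = (ALPHABET.index(WIRING[(p + offset) % 26]) - offset) % 26
        out.append(ALPHABET[c])
        offset = (offset + 1) % 26  # the disk rotates one step per letter
    return "".join(out)


def rotor_decrypt(ciphertext, start=0):
    """Invert rotor_encrypt by running the wiring backwards."""
    out = []
    offset = start
    for ch in ciphertext:
        c = ALPHABET.index(ch)
        p = (WIRING.index(ALPHABET[(c + offset) % 26]) - offset) % 26
        out.append(ALPHABET[p])
        offset = (offset + 1) % 26
    return "".join(out)


print(rotor_encrypt("AAAA"))  # the same plaintext letter maps to different letters
```

With one rotor the alphabet repeats after 26 letters. Chaining several such disks, each stepping at a different rate, multiplies the number of distinct alphabets, which is the effect the text describes.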
The one-time pad
One type of substitution cipher, the
one-time pad
, is unique. It was invented near the end of World War I by
Gilbert Vernam and
Joseph Mauborgne in the US. It was mathematically proven unbreakable by
Claude Shannon
, probably during
World War II
; his work was first published in the late 1940s. In its most common implementation, the one-time pad can be called a substitution cipher only from an unusual perspective; typically, the plaintext letter is combined (not substituted) in some manner (e.g.,
XOR
) with the key material character at that position.
The one-time pad is, in most cases, impractical as it requires that the key material be as long as the plaintext, ''actually''
random
, used once and ''only'' once, and kept entirely secret from all except the sender and intended receiver. When these conditions are violated, even marginally, the one-time pad is no longer unbreakable.
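The XOR combination described above can be sketched in a few lines. This is a minimal illustration with hypothetical function names. It draws the key from Python's `secrets` module, which is a cryptographically strong pseudorandom source and not a true random source, so it only approximates the "actually random" condition. The key must still be as long as the message, used once, and kept secret.

```python
import secrets


def otp_encrypt(plaintext: bytes):
    """Return (key, ciphertext); the key is as long as the message."""
    # Assumption: secrets.token_bytes stands in for truly random key material.
    key = secrets.token_bytes(len(plaintext))
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, key))
    return key, ciphertext


def otp_decrypt(ciphertext: bytes, key: bytes) -> bytes:
    """XOR with the same key recovers the plaintext."""
    if len(key) != len(ciphertext):
        raise ValueError("key must be exactly as long as the ciphertext")
    return bytes(c ^ k for c, k in zip(ciphertext, key))


key, ciphertext = otp_encrypt(b"ATTACK AT DAWN")
print(otp_decrypt(ciphertext, key))
```

Because XOR is its own inverse, encryption and decryption are the same operation. Reusing `key` for a second message is precisely the kind of violation that makes the scheme breakable.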
Soviet
one-time pad messages sent from the US for a brief time during World War II used
non-random
key material. US cryptanalysts, beginning in the late 1940s, were able to break, entirely or partially, a few thousand messages out of several hundred thousand. (See
Venona project
)
In a mechanical implementation, rather like the
Rockex equipment, the one-time pad was used for messages sent on the
Moscow
-
Washington
''hot line'' established after the
Cuban Missile Crisis
.
Substitution in modern cryptography
Substitution ciphers as discussed above, especially the older pencil-and-paper hand ciphers, are no longer in serious use. However, the cryptographic concept of substitution carries on even today. From a sufficiently abstract perspective, modern bit-oriented
block cipher
s (e.g.,
DES
, or
AES) can be viewed as substitution ciphers on an enormously large
binary
alphabet. In addition, block ciphers often include smaller substitution tables called
S-boxes. See also
substitution–permutation network
.
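An S-box is simply a small lookup table applied to a few bits at a time. The sketch below uses a made-up 4-bit table (an arbitrary permutation of 0 to 15 chosen for illustration, not taken from DES, AES, or any other real cipher) and a hypothetical helper that substitutes each half-byte of the input. On its own such a table provides no security; real ciphers combine it with key mixing and permutation over many rounds.

```python
# Illustrative 4-bit S-box (an assumption): an arbitrary permutation of 0..15.
SBOX = [6, 11, 5, 4, 2, 14, 7, 10, 9, 13, 15, 12, 3, 1, 0, 8]

# Build the inverse table so the substitution can be undone.
INV_SBOX = [0] * 16
for index, value in enumerate(SBOX):
    INV_SBOX[value] = index


def substitute_nibbles(data: bytes, table) -> bytes:
    """Replace the high and low 4 bits of every byte using the table."""
    return bytes((table[b >> 4] << 4) | table[b & 0x0F] for b in data)


scrambled = substitute_nibbles(b"\x01\xff", SBOX)
print(scrambled.hex())
print(substitute_nibbles(scrambled, INV_SBOX).hex())
```

The table must be a permutation for the inverse to exist, which mirrors the requirement that a substitution cipher's alphabet mapping be one-to-one.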
Substitution ciphers in popular culture
*
Sherlock Holmes
breaks a substitution cipher in "
The Adventure of the Dancing Men". There, the cipher remained undeciphered for years if not decades, not because of its difficulty but because
no one suspected it to be a code, instead considering it childish scribblings.
*The
Al Bhed
language in ''
Final Fantasy X
'' is actually a substitution cipher, although it is pronounced phonetically (i.e. "you" in English is translated to "oui" in Al Bhed, but is pronounced the same way that "oui" is pronounced in
French
).
*The
Minbari's alphabet from the ''
Babylon 5'' series is a substitution cipher from English.
*The language in ''
Starfox Adventures: Dinosaur Planet'' spoken by native Saurians and
Krystal
is also a substitution cipher of the
English alphabet
.
*The television program ''
Futurama
'' contained a substitution cipher in which all 26 letters were replaced by symbols and called
"Alien Language". Die-hard viewers deciphered it rather quickly after a "Slurm" ad showed the word "Drink" in both plain English and the alien language, giving away the key. Later, the producers created a second alien language that combined substitution with a mathematical cipher: once a symbol has been converted to its English letter, the numerical value of that letter (0 for "A" through 25 for "Z") is added (modulo 26) to the value of the previous letter to give the intended letter. These messages can be seen throughout every episode of the series and the subsequent movies.
*At the end of every season 1 episode of the cartoon series ''
Gravity Falls'', during the credit roll, there is one of three simple substitution ciphers: a -3
Caesar cipher (hinted by "3 letters back" at the end of the opening sequence), an
Atbash cipher, or a letter-to-number simple substitution cipher. The season 1 finale encodes a message with all three. In the second season,
Vigenère ciphers are used in place of the various monoalphabetic ciphers, each using a key hidden within its episode.
*In the
Artemis Fowl series by
Eoin Colfer
there are three substitution ciphers: Gnommish, Centaurean and Eternean, which run along the bottom of the pages or appear elsewhere within the books.
*In ''Bitterblue'', the third novel by
Kristin Cashore
, substitution ciphers serve as an important form of coded communication.
*In the 2013 video game ''
BioShock Infinite'', substitution ciphers are hidden throughout the game; the player must find code books to decipher them and gain access to a surplus of supplies.
*In the anime adaptation of ''
The Devil Is a Part-Timer!'', the language of Ente Isla, called Entean, uses a substitution cipher with the ciphertext alphabet , leaving only A, E, I, O, U, L, N, and Q in their original positions.
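Two of the simple ciphers named in the list above, the Caesar shift and Atbash, are small enough to sketch directly. The function names and the sample text are hypothetical and are not taken from any of the works mentioned. A shift of -3 moves each letter three places back in the alphabet, and Atbash reverses the alphabet.

```python
import string

ALPHABET = string.ascii_uppercase


def caesar_shift(text, shift):
    """Shift each letter by `shift` places, wrapping around; keep other characters."""
    result = []
    for ch in text.upper():
        if ch in ALPHABET:
            result.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            result.append(ch)
    return "".join(result)


def atbash(text):
    """Map A to Z, B to Y, and so on; applying it twice restores the text."""
    return text.upper().translate(str.maketrans(ALPHABET, ALPHABET[::-1]))


print(caesar_shift("KHOOR", -3))  # "3 letters back"
print(atbash("HELLO"))
```

Both are monoalphabetic: each letter always maps to the same substitute, which is why a single known word can be enough to recover the whole key.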
See also
*
Ban (unit) with Centiban Table
*
Copiale cipher
*
Leet
*
Vigenère cipher
*
Topics in cryptography
References
External links
quipqiup: An automated tool for solving simple substitution ciphers both with and without known word boundaries.
CrypTool: Exhaustive free and open-source e-learning tool to perform and break substitution ciphers and many more.
Substitution Cipher Toolkit: Application that can, among other things, automatically decrypt texts encrypted with a substitution cipher.
A monoalphabetic cipher cracker.
Monoalphabetic Cipher Implementation for Encrypting File (C Language).
Substitution cipher implementation with Caesar and Atbash ciphers (Java)
(Flash)
Online simple substitution implementation for MAKEPROFIT code (CGI script: Set input in URL, read output in web page)
Breaking A Monoalphabetic Encryption System Using a Known Plaintext Attack
* http://cryptoclub.math.uic.edu/substitutioncipher/sub2.htm
Daily Cryptoquip (Substitution) Cipher Answer