Substitution Alphabet

In cryptography, a substitution cipher is a method of encrypting in which units of plaintext are replaced with ciphertext in a defined manner, with the help of a key; the "units" may be single letters (the most common), pairs of letters, triplets of letters, mixtures of the above, and so forth. The receiver deciphers the text by performing the inverse substitution process to extract the original message.

Substitution ciphers can be compared with transposition ciphers. In a transposition cipher, the units of the plaintext are rearranged in a different and usually quite complex order, but the units themselves are left unchanged. By contrast, in a substitution cipher, the units of the plaintext are retained in the same sequence in the ciphertext, but the units themselves are altered.

There are a number of different types of substitution cipher. If the cipher operates on single letters, it is termed a simple substitution cipher; a cipher that operates on larger groups of letters is termed polygraphic. A monoalphabetic cipher uses a fixed substitution over the entire message, whereas a polyalphabetic cipher uses a number of substitutions at different positions in the message, where a unit from the plaintext is mapped to one of several possibilities in the ciphertext and vice versa.

The first published description of how to crack simple substitution ciphers was given by Al-Kindi in ''A Manuscript on Deciphering Cryptographic Messages'', written around 850 AD. The method he described is now known as frequency analysis.
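The contrast between substitution and transposition can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, Atbash (a reversed alphabet) stands in for "a substitution", and a plain reversal of the text stands in for the much more complex reorderings real transposition ciphers use.

```python
import string

ALPHABET = string.ascii_uppercase

def substitute(text, cipher_alphabet):
    """Replace each letter through a fixed table; positions stay the same."""
    table = str.maketrans(ALPHABET, cipher_alphabet)
    return text.upper().translate(table)

def transpose(text):
    """Toy transposition (an assumption for illustration): reverse the
    order of the units while leaving the units themselves unchanged."""
    return text.upper()[::-1]

plain = "ATTACK"
atbash = ALPHABET[::-1]                  # reversed alphabet used as the key
substituted = substitute(plain, atbash)  # letters change, order is kept
transposed = transpose(plain)            # order changes, letters are kept
```

In the substituted text the repeated letters of the plaintext still sit in the same positions, whereas the transposed text contains exactly the same letters as the plaintext in a different order.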


Types


Simple

The simplest substitution ciphers are the Caesar cipher and the Atbash cipher. Here single letters are substituted (referred to as simple substitution). This can be demonstrated by writing out the alphabet twice, once in regular order and again with the letters shifted by some number of steps or reversed, to represent the ''ciphertext alphabet'' (or substitution alphabet).

The substitution alphabet could also be scrambled in a more complex fashion, in which case it is called a mixed alphabet or ''deranged alphabet''. Traditionally, mixed alphabets may be created by first writing out a keyword, removing repeated letters in it, then writing all the remaining letters of the alphabet in the usual order.

Using this system, the keyword "zebras" gives us the following alphabets:

Plaintext alphabet:  ABCDEFGHIJKLMNOPQRSTUVWXYZ
Ciphertext alphabet: ZEBRASCDFGHIJKLMNOPQTUVWXY

A message

flee at once. we are discovered!

enciphers to

SIAA ZQ LKBA. VA ZOA RFPBLUAOAR!

And the keyword "grandmother" gives us the following alphabets:

Plaintext alphabet:  ABCDEFGHIJKLMNOPQRSTUVWXYZ
Ciphertext alphabet: GRANDMOTHEBCFIJKLPQSUVWXYZ

The same message

flee at once. we are discovered!

enciphers to

MCDD GS JIAD. WD GPD NHQAJVDPDN!

Usually the ciphertext is written out in blocks of fixed length, omitting punctuation and spaces; this is done to disguise word boundaries from the plaintext and to help avoid transmission errors. These blocks are called "groups", and sometimes a "group count" (i.e. the number of groups) is given as an additional check. Five-letter groups are often used, dating from when messages used to be transmitted by telegraph:

SIAAZ QLKBA VAZOA RFPBL UAOAR

If the length of the message happens not to be divisible by five, it may be padded at the end with "nulls". These can be any characters that decrypt to obvious nonsense, so that the receiver can easily spot them and discard them.

The ciphertext alphabet is sometimes different from the plaintext alphabet; for example, in the pigpen cipher, the ciphertext consists of a set of symbols derived from a grid. Such features make little difference to the security of a scheme, however: at the very least, any set of strange symbols can be transcribed back into an A-Z alphabet and dealt with as normal.

In lists and catalogues for salespeople, a very simple encryption is sometimes used to replace numeric digits by letters:

Plaintext digits:    1234567890
Ciphertext alphabet: MAKEPROFIT

Examples: MAT would be used to represent 120, PAPR would be used for 5256, and OFTK would be used for 7803.
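The constructions described in this section can be sketched in Python as follows. This is a minimal sketch, not a definitive implementation: all function names are hypothetical, the keywords `zebras` and `grandmother` are assumed because they reproduce the two ciphertexts quoted above, and padding with the ciphertext letter `X` is just one possible choice of null.

```python
import string

ALPHABET = string.ascii_uppercase

def caesar_alphabet(shift):
    """Ciphertext alphabet shifted by a number of steps (Caesar cipher)."""
    shift %= 26
    return ALPHABET[shift:] + ALPHABET[:shift]

def atbash_alphabet():
    """Ciphertext alphabet written in reverse order (Atbash cipher)."""
    return ALPHABET[::-1]

def keyword_alphabet(keyword):
    """Mixed alphabet: keyword with repeats removed, followed by the
    remaining letters in their usual order."""
    seen = []
    for ch in keyword.upper() + ALPHABET:
        if ch in ALPHABET and ch not in seen:
            seen.append(ch)
    return "".join(seen)

def encipher(plaintext, cipher_alphabet):
    """Substitute letters; spaces and punctuation pass through unchanged."""
    table = str.maketrans(ALPHABET, cipher_alphabet)
    return plaintext.upper().translate(table)

def decipher(ciphertext, cipher_alphabet):
    """Apply the inverse substitution."""
    table = str.maketrans(cipher_alphabet, ALPHABET)
    return ciphertext.upper().translate(table)

def in_groups(ciphertext, size=5, null="X"):
    """Drop spaces and punctuation, pad with nulls (assumed here to be
    the letter X), and write the result in fixed-length groups."""
    letters = [ch for ch in ciphertext.upper() if ch in ALPHABET]
    while len(letters) % size:
        letters.append(null)
    return " ".join(
        "".join(letters[i:i + size]) for i in range(0, len(letters), size)
    )

message = "flee at once. we are discovered!"
zebras = keyword_alphabet("zebras")
ciphertext = encipher(message, zebras)
grouped = in_groups(ciphertext)
```

Because encipherment is a fixed letter-for-letter table, deciphering only needs the same two alphabets with their roles swapped, which is what `decipher` does.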


Security

Although the traditional keyword method for creating a mixed substitution alphabet is simple, a serious disadvantage is that the last letters of the alphabet (which are mostly low frequency) tend to stay at the end. A stronger way of constructing a mixed alphabet is to generate the substitution alphabet completely randomly.

Although the number of possible substitution alphabets is very large (26! ≈ 2^88.4, or about 88 bits), this cipher is not very strong, and is easily broken. Provided the message is of reasonable length (see below), the cryptanalyst can deduce the probable meaning of the most common symbols by analyzing the frequency distribution of the ciphertext. This allows formation of partial words, which can be tentatively filled in, progressively expanding the (partial) solution (see frequency analysis for a demonstration of this).

In some cases, underlying words can also be determined from the pattern of their letters; for example, the English words ''tater'', ''ninth'', and ''paper'' all have the pattern ''ABACD''. Many people solve such ciphers for recreation, as with cryptogram puzzles in the newspaper.

According to the unicity distance of English, 27.6 letters of ciphertext are required to crack a mixed alphabet simple substitution. In practice, typically about 50 letters are needed, although some messages can be broken with fewer if unusual patterns are found. In other cases, the plaintext can be contrived to have a nearly flat frequency distribution, and much longer plaintexts will then be required by the cryptanalyst.
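The quantities and tools mentioned in this section can be illustrated with a short Python sketch. The helper names below are hypothetical, and the code only covers the first steps of an attack (counting letters and matching word patterns), not a full solver.

```python
import math
from collections import Counter

def key_space_bits():
    """Size of the key space of a mixed alphabet, log2(26!), in bits."""
    return math.log2(math.factorial(26))

def letter_frequencies(ciphertext):
    """Count ciphertext letters, most common first."""
    letters = [ch for ch in ciphertext.upper() if ch.isalpha()]
    return Counter(letters).most_common()

def word_pattern(word):
    """Label letters in order of first appearance: 'paper' -> 'ABACD'."""
    mapping = {}
    out = []
    for ch in word.upper():
        if ch not in mapping:
            mapping[ch] = chr(ord("A") + len(mapping))
        out.append(mapping[ch])
    return "".join(out)

bits = key_space_bits()
counts = letter_frequencies("SIAA ZQ LKBA. VA ZOA RFPBLUAOAR!")
patterns = [word_pattern(w) for w in ("tater", "ninth", "paper")]
```

In the sample ciphertext the most frequent symbol stands for plaintext ''e'', which is the kind of guess a cryptanalyst would try first, although a message this short is well below the roughly 50 letters usually needed in practice.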


Nomenclator

One once-common variant of the substitution cipher is the nomenclator. Named after the public official who announced the titles of visiting dignitaries, this cipher uses a small code sheet containing letter, syllable and word substitution tables, sometimes homophonic, that typically converted symbols into numbers. Originally the code portion was restricted to the names of important people, hence the name of the cipher; in later years, it covered many common words and place names as well. The symbols for whole words (''codewords'' in modern parlance) and letters (''cipher'' in modern parlance) were not distinguished in the ciphertext. The Rossignols' Great Cipher used by Louis XIV of France was one.

Nomenclators were the standard fare of diplomatic correspondence, espionage, and advanced political conspiracy from the early fifteenth century to the late eighteenth century; most conspirators were and have remained less cryptographically sophisticated. Although government intelligence cryptanalysts were systematically breaking nomenclators by the mid-sixteenth century, and superior systems had been available since 1467, the usual response to cryptanalysis was simply to make the tables larger. By the late eighteenth century, when the system was beginning to die out, some nomenclators had 50,000 symbols.

Nevertheless, not all nomenclators were broken; today, cryptanalysis of archived ciphertexts remains a fruitful area of historical research.


Homophonic

An early attempt to increase the difficulty of frequency analysis attacks on substitution ciphers was to disguise plaintext letter frequencies by homophony. In these ciphers, plaintext letters map to more than one ciphertext symbol. Usually, the highest-frequency plaintext symbols are given more equivalents than lower-frequency letters. In this way, the frequency distribution is flattened, making analysis more difficult.

Since more than 26 characters will be required in the ciphertext alphabet, various solutions are employed to invent larger alphabets. Perhaps the simplest is to use a numeric substitution 'alphabet'. Another method consists of simple variations on the existing alphabet: uppercase, lowercase, upside down, etc. More artistically, though not necessarily more securely, some homophonic ciphers employed wholly invented alphabets of fanciful symbols.

The book cipher is a type of homophonic cipher, one example being the Beale ciphers. This is a story of buried treasure that was described in 1819–21 by use of a ciphered text that was keyed to the Declaration of Independence. Here each ciphertext character was represented by a number. The number was determined by taking the plaintext character, finding a word in the Declaration of Independence that started with that character, and using the numerical position of that word in the Declaration of Independence as the encrypted form of that letter. Since many words in the Declaration of Independence start with the same letter, the encryption of that character could be any of the numbers associated with the words that start with that letter. Deciphering the encrypted text character ''X'' (which is a number) is as simple as looking up the Xth word of the Declaration of Independence and using the first letter of that word as the decrypted character.

Another homophonic cipher was described by Stahl and was one of the first attempts to provide for computer security of data systems in computers through encryption. Stahl constructed the cipher in such a way that the number of homophones for a given character was in proportion to the frequency of the character, thus making frequency analysis much more difficult.

Francesco I Gonzaga, Duke of Mantua, used the earliest known example of a homophonic substitution cipher in 1401 for correspondence with one Simone de Crema.

Mary, Queen of Scots, while imprisoned by Elizabeth I, used homophonic ciphers during the years from 1578 to 1584, with additional encryption using a nomenclator for frequent prefixes, suffixes, and proper names, while communicating with her allies including Michel de Castelnau.
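The book-cipher scheme described above can be sketched in Python. This is a toy under stated assumptions: the key text is only the first 24 words of the Declaration of Independence, used here as a short stand-in, so only letters that begin one of those words can be enciphered; the function names are hypothetical and are not taken from any account of the Beale ciphers.

```python
import random
from collections import defaultdict

# Short stand-in key text (assumption: a real key would be the full document).
KEY_TEXT = (
    "When in the Course of human events it becomes necessary for one "
    "people to dissolve the political bands which have connected them "
    "with another"
)

def build_homophones(key_text):
    """Map each initial letter to the 1-based positions of the words
    that start with it; these positions are that letter's homophones."""
    table = defaultdict(list)
    for position, word in enumerate(key_text.split(), start=1):
        table[word[0].upper()].append(position)
    return table

def encrypt(plaintext, key_text, rng=random):
    """Replace each letter with the position of a randomly chosen word
    starting with that letter. Raises IndexError if no such word exists."""
    table = build_homophones(key_text)
    return [rng.choice(table[ch]) for ch in plaintext.upper() if ch.isalpha()]

def decrypt(numbers, key_text):
    """Look up the Xth word of the key text and take its first letter."""
    words = key_text.split()
    return "".join(words[n - 1][0].upper() for n in numbers)

numbers = encrypt("hide the note", KEY_TEXT, random.Random(0))
recovered = decrypt(numbers, KEY_TEXT)
```

A letter such as T has several homophones in this key text (positions 3, 14, 16 and 22), so repeated plaintext letters need not produce repeated numbers, which is what flattens the frequency distribution.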


Polyalphabetic

The work of Al-Qalqashandi (1355–1418), based on the earlier work of Ibn al-Durayhim (1312–1359), contained the first published discussion of the substitution and transposition of ciphers, as well as the first description of a polyalphabetic cipher, in which each plaintext letter is assigned more than one substitute. Polyalphabetic substitution ciphers were later described in 1467 by Leone Battista Alberti in the form of disks. Johannes Trithemius, in his book ''Steganographia'' (Ancient Greek for "hidden writing"), introduced the now more standard form of a ''tableau'' (see below; ca. 1500 but not published until much later). A more sophisticated version using mixed alphabets was described in 1563 by Giovanni Battista della Porta in his book ''De Furtivis Literarum Notis'' (Latin for "On concealed characters in writing").

In a polyalphabetic cipher, multiple cipher alphabets are used. To facilitate encryption, all the alphabets are usually written out in a large table, traditionally called a ''tableau''. The tableau is usually 26×26, so that 26 full ciphertext alphabets are available. The method of filling the tableau, and of choosing which alphabet to use next, defines the particular polyalphabetic cipher. All such ciphers are easier to break than once believed, as substitution alphabets are repeated for sufficiently large plaintexts.

One of the most popular was that of Blaise de Vigenère. First published in 1585, it was considered unbreakable until 1863, and indeed was commonly called ''le chiffre indéchiffrable'' (French for "indecipherable cipher").

In the Vigenère cipher, the first row of the tableau is filled out with a copy of the plaintext alphabet, and successive rows are simply shifted one place to the left. (Such a simple tableau is called a ''tabula recta'', and mathematically corresponds to adding the plaintext and key letters, modulo 26.) A keyword is then used to choose which ciphertext alphabet to use. Each letter of the keyword is used in turn, and then they are repeated again from the beginning. So if the keyword is 'CAT', the first letter of plaintext is enciphered under alphabet 'C', the second under 'A', the third under 'T', the fourth under 'C' again, and so on; or if the keyword is 'RISE', the first letter of plaintext is enciphered under alphabet 'R', the second under 'I', the third under 'S', the fourth under 'E', and so on. In practice, Vigenère keys were often phrases several words long.

In 1863, Friedrich Kasiski published a method (probably discovered secretly and independently before the Crimean War by Charles Babbage) which enabled the calculation of the length of the keyword in a Vigenère-ciphered message. Once this was done, ciphertext letters that had been enciphered under the same alphabet could be picked out and attacked separately as a number of semi-independent simple substitutions. The attack was complicated by the fact that within one alphabet letters were separated and did not form complete words, but simplified by the fact that usually a ''tabula recta'' had been employed.

As such, even today a Vigenère-type cipher should theoretically be difficult to break if mixed alphabets are used in the tableau, if the keyword is random, and if the total length of ciphertext is less than 27.67 times the length of the keyword. These requirements are rarely understood in practice, and so the security of Vigenère-enciphered messages is usually less than it might have been.

Other notable polyalphabetics include:
* The Gronsfeld cipher. This is identical to the Vigenère except that only 10 alphabets are used, and so the "keyword" is numerical.
* The Beaufort cipher. This is practically the same as the Vigenère, except the ''tabula recta'' is replaced by a backwards one, mathematically equivalent to ciphertext = key - plaintext. This operation is ''self-inverse'', whereby the same table is used for both encryption and decryption.
* The autokey cipher, which mixes plaintext with a key to avoid periodicity.
* The running key cipher, where the key is made very long by using a passage from a book or similar text.

Modern stream ciphers can also be seen, from a sufficiently abstract perspective, to be a form of polyalphabetic cipher in which all the effort has gone into making the keystream as long and unpredictable as possible.
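The arithmetic behind the ''tabula recta'' can be sketched in Python: Vigenère adds plaintext and key letters modulo 26, and Beaufort computes key minus plaintext. This is a minimal sketch assuming an unmixed A-Z tableau and a non-empty, letters-only keyword; the function names and the sample keyword `LEMON` are illustrative choices, not part of the text above.

```python
import string
from itertools import cycle

ALPHABET = string.ascii_uppercase

def _key_shifts(keyword):
    """Repeat the keyword's letter values (A=0 ... Z=25) indefinitely."""
    return cycle([ALPHABET.index(k) for k in keyword.upper()])

def vigenere(text, keyword, decrypt=False):
    """Tabula recta encipherment: c = (p + k) mod 26, p = (c - k) mod 26."""
    shifts = _key_shifts(keyword)
    out = []
    for ch in text.upper():
        if ch not in ALPHABET:
            out.append(ch)        # non-letters do not consume a key letter
            continue
        k = next(shifts)
        p = ALPHABET.index(ch)
        out.append(ALPHABET[(p - k) % 26 if decrypt else (p + k) % 26])
    return "".join(out)

def beaufort(text, keyword):
    """Beaufort: c = (k - p) mod 26; the same call also decrypts."""
    shifts = _key_shifts(keyword)
    out = []
    for ch in text.upper():
        if ch not in ALPHABET:
            out.append(ch)
            continue
        k = next(shifts)
        out.append(ALPHABET[(k - ALPHABET.index(ch)) % 26])
    return "".join(out)

ciphertext = vigenere("ATTACKATDAWN", "LEMON")
plaintext = vigenere(ciphertext, "LEMON", decrypt=True)
```

Enciphering a run of the letter A exposes the keyword directly (`vigenere("AAAA", "CAT")` gives `CATC`), which illustrates why the periodic reuse of a short keyword is the weakness Kasiski's method exploits.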


Polygraphic

In a polygraphic substitution cipher, plaintext letters are substituted in larger groups, instead of substituting letters individually. The first advantage is that the frequency distribution is much flatter than that of individual letters (though not actually flat in real languages; for example, 'OS' is much more common than 'RÑ' in Spanish). Second, the larger number of symbols requires correspondingly more ciphertext to productively analyze letter frequencies.

To substitute ''pairs'' of letters would take a substitution alphabet 676 symbols long (26^2). In the same ''De Furtivis Literarum Notis'' mentioned above, della Porta actually proposed such a system, with a 20 × 20 tableau (for the 20 letters of the Italian/Latin alphabet he was using) filled with 400 unique glyphs. However, the system was impractical and probably never actually used.

The earliest practical digraphic cipher (pairwise substitution) was the so-called Playfair cipher, invented by Sir Charles Wheatstone in 1854. In this cipher, a 5 × 5 grid is filled with the letters of a mixed alphabet (two letters, usually I and J, are combined). A digraphic substitution is then simulated by taking pairs of letters as two corners of a rectangle, and using the other two corners as the ciphertext (see the Playfair cipher main article for a diagram). Special rules handle double letters and pairs falling in the same row or column. Playfair was in military use from the Boer War through World War II.

Several other practical polygraphics were introduced in 1901 by Felix Delastelle, including the bifid and four-square ciphers (both digraphic) and the trifid cipher (probably the first practical trigraphic).

The Hill cipher, invented in 1929 by Lester S. Hill, is a polygraphic substitution which can combine much larger groups of letters simultaneously using linear algebra. Each letter is treated as a digit in base 26: A = 0, B = 1, and so on. (In a variation, 3 extra symbols are added to make the basis prime.) A block of n letters is then considered as a vector of n dimensions, and multiplied by an n × n matrix, modulo 26. The components of the matrix are the key, and should be random provided that the matrix is invertible in \mathbb{Z}_{26}^n (to ensure decryption is possible). A mechanical version of the Hill cipher of dimension 6 was patented in 1929.

The Hill cipher is vulnerable to a known-plaintext attack because it is completely linear, so it must be combined with some non-linear step to defeat this attack. The combination of wider and wider weak, linear diffusive steps like a Hill cipher, with non-linear substitution steps, ultimately leads to a
substitution–permutation network In cryptography, an SP-network, or substitution–permutation network (SPN), is a series of linked mathematical operations used in block cipher algorithms such as AES (Rijndael), 3-Way, Kalyna, Kuznyechik, PRESENT, SAFER, SHARK, and Square. ...
(e.g. a
Feistel cipher In cryptography, a Feistel cipher (also known as Luby–Rackoff block cipher) is a symmetric structure used in the construction of block ciphers, named after the German-born physicist and cryptographer Horst Feistel, who did pioneering resear ...
), so it is possible – from this extreme perspective – to consider modern
block cipher In cryptography, a block cipher is a deterministic algorithm that operates on fixed-length groups of bits, called ''blocks''. Block ciphers are the elementary building blocks of many cryptographic protocols. They are ubiquitous in the storage a ...
s as a type of polygraphic substitution.
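
The matrix arithmetic of the Hill cipher can be illustrated with a minimal Python sketch for the 2 × 2 case. The key matrix `KEY`, the sample word, and the helper names `apply_matrix` and `invert_2x2` are assumptions chosen for illustration and do not come from the text above; blocks are treated as column vectors, and the input is assumed to be uppercase A–Z with a length that is a multiple of the block size.

```python
from math import gcd

M = 26  # alphabet size: A = 0, B = 1, ..., Z = 25


def apply_matrix(key, text):
    """Multiply each block of len(key) letters by the key matrix, modulo 26."""
    n = len(key)
    nums = [ord(c) - ord("A") for c in text]
    if len(nums) % n != 0:
        raise ValueError("text length must be a multiple of the block size")
    out = []
    for i in range(0, len(nums), n):
        block = nums[i:i + n]
        for row in key:
            # One output letter per matrix row: dot product reduced modulo 26.
            out.append(sum(k * x for k, x in zip(row, block)) % M)
    return "".join(chr(v + ord("A")) for v in out)


def invert_2x2(key):
    """Invert a 2 x 2 matrix modulo 26 via the adjugate (illustrative helper)."""
    (a, b), (c, d) = key
    det = (a * d - b * c) % M
    if gcd(det, M) != 1:
        raise ValueError("key matrix is not invertible modulo 26")
    det_inv = pow(det, -1, M)  # modular inverse, Python 3.8+
    return [[(d * det_inv) % M, (-b * det_inv) % M],
            [(-c * det_inv) % M, (a * det_inv) % M]]


KEY = [[3, 3], [2, 5]]  # hypothetical key; determinant 9 is coprime to 26
ciphertext = apply_matrix(KEY, "HELP")
recovered = apply_matrix(invert_2x2(KEY), ciphertext)
print(ciphertext, recovered)  # HIAT HELP
```

Because encryption is a single linear map, decryption is the same routine applied with the inverse matrix. That same linearity is what makes a known-plaintext attack possible: enough plaintext and ciphertext block pairs determine the key matrix by solving linear equations modulo 26.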


Mechanical

Between roughly World War I and the widespread availability of computers (for some governments this was approximately the 1950s or 1960s; for other organizations it was a decade or more later; for individuals it was no earlier than 1975), mechanical implementations of polyalphabetic substitution ciphers were widely used. Several inventors had similar ideas about the same time, and rotor cipher machines were patented four times in 1919. The most important of the resulting machines was the Enigma, especially in the versions used by the German military from approximately 1930. The Allies also developed and used rotor machines (e.g., SIGABA and Typex). All of these were similar in that the substituted letter was chosen electrically from amongst the huge number of possible combinations resulting from the rotation of several letter disks. Since one or more of the disks rotated mechanically with each plaintext letter enciphered, the number of alphabets used was astronomical. Early versions of these machines were, nevertheless, breakable. William F. Friedman of the US Army's SIS found vulnerabilities in Hebern's rotor machine early on, and the Government Code and Cypher School's Dillwyn Knox solved versions of the Enigma machine (those without the "plugboard") well before WWII began. Traffic protected by essentially all of the German military Enigmas was broken by Allied cryptanalysts, most notably those at Bletchley Park, beginning with the German Army variant used in the early 1930s. This version was broken through inspired mathematical insight by Marian Rejewski in Poland. As far as is publicly known, no messages protected by the SIGABA and Typex machines were ever broken during or near the time when these systems were in service.


One-time pad

One type of substitution cipher, the one-time pad, is unique. It was invented near the end of World War I by Gilbert Vernam and Joseph Mauborgne in the US. It was mathematically proven unbreakable by Claude Shannon, probably during World War II; his work was first published in the late 1940s. In its most common implementation, the one-time pad can be called a substitution cipher only from an unusual perspective; typically, the plaintext letter is combined (not substituted) in some manner (e.g., XOR) with the key material character at that position. The one-time pad is, in most cases, impractical as it requires that the key material be as long as the plaintext, ''actually'' random, used once and ''only'' once, and kept entirely secret from all except the sender and intended receiver. When these conditions are violated, even marginally, the one-time pad is no longer unbreakable. Soviet one-time pad messages sent from the US for a brief time during World War II used non-random key material. US cryptanalysts, beginning in the late 1940s, were able to break, entirely or partially, a few thousand messages out of several hundred thousand (see Venona project). In a mechanical implementation, rather like the Rockex equipment, the one-time pad was used for messages sent on the Moscow-Washington ''hot line'' established after the Cuban Missile Crisis.
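
As a rough illustration of the XOR combination described above, the following Python sketch enciphers a message with a pad of equal length and then shows why a pad must never be reused. The function name `otp_xor` and the sample messages are assumptions made for the example; `secrets.token_bytes` is only a convenient stand-in for key material, since a true one-time pad requires genuinely random keys.

```python
import secrets


def otp_xor(data: bytes, pad: bytes) -> bytes:
    """XOR each message byte with the pad byte at the same position."""
    if len(pad) < len(data):
        raise ValueError("pad must be at least as long as the message")
    return bytes(d ^ k for d, k in zip(data, pad))


message = b"ATTACK AT DAWN"
pad = secrets.token_bytes(len(message))  # stand-in for truly random key material

ciphertext = otp_xor(message, pad)
recovered = otp_xor(ciphertext, pad)     # XOR with the same pad undoes itself

# Reusing the pad for a second message of the same length leaks information:
# the pad cancels out, leaving the XOR of the two plaintexts.
other = b"RETREAT AT TEN"
leak = otp_xor(ciphertext, otp_xor(other, pad))
pad_cancels = leak == otp_xor(message, other)
```

The last lines show one way the "used once and only once" condition can fail: an eavesdropper holding two ciphertexts made with the same pad can combine them and work on the result without knowing the pad at all.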


In modern cryptography

Substitution ciphers as discussed above, especially the older pencil-and-paper hand ciphers, are no longer in serious use. However, the cryptographic concept of substitution carries on even today. From an abstract perspective, modern bit-oriented block ciphers (e.g., DES or AES) can be viewed as substitution ciphers on a large binary alphabet. In addition, block ciphers often include smaller substitution tables called S-boxes. See also substitution–permutation network.


In popular culture

* Sherlock Holmes breaks a substitution cipher in "The Adventure of the Dancing Men". There, the cipher remained undeciphered for years if not decades; not due to its difficulty, but because no one suspected it to be a code, instead considering it childish scribblings.
* The Standard Galactic Alphabet, the writing system in the ''Commander Keen'' video games and in ''Minecraft''.
* The Al Bhed language in ''Final Fantasy X'' is actually a substitution cipher, although it is pronounced phonetically (i.e. "you" in English is translated to "oui" in Al Bhed, but is pronounced the same way that "oui" is pronounced in French).
* The Minbari's alphabet from the ''Babylon 5'' series is a substitution cipher from English.
* The language in ''Starfox Adventures: Dinosaur Planet'' spoken by native Saurians and Krystal is also a substitution cipher of the English alphabet.
* The television program ''Futurama'' contained a substitution cipher in which all 26 letters were replaced by symbols and called "Alien Language". Die-hard viewers deciphered it rather quickly after a "Slurm" ad showed the word "Drink" in both plain English and the alien language, giving away the key. Later, the producers created a second alien language that used a combination of replacement and mathematical ciphers. Once the English letter of the alien language is deciphered, the numerical value of that letter (0 for "A" through 25 for "Z") is added (modulo 26) to the value of the previous letter to give the actual intended letter. These messages can be seen throughout every episode of the series and the subsequent movies.
* At the end of every season 1 episode of the cartoon series ''Gravity Falls'', during the credit roll, there is one of three simple substitution ciphers: a -3 Caesar cipher (hinted by "3 letters back" at the end of the opening sequence), an Atbash cipher, or a letter-to-number simple substitution cipher. The season 1 finale encodes a message with all three. In the second season, Vigenère ciphers are used in place of the various monoalphabetic ciphers, each using a key hidden within its episode.
* In the ''Artemis Fowl'' series by Eoin Colfer there are three substitution ciphers: Gnommish, Centaurean and Eternean, which run along the bottom of the pages or appear elsewhere within the books.
* In ''Bitterblue'', the third novel by Kristin Cashore, substitution ciphers serve as an important form of coded communication.
* In the 2013 video game ''BioShock Infinite'', there are substitution ciphers hidden throughout the game; the player must find code books to decipher them and gain access to a surplus of supplies.
* In the anime adaptation of ''The Devil Is a Part-Timer!'', the language of Ente Isla, called Entean, uses a substitution cipher whose ciphertext alphabet leaves only A, E, I, O, U, L, N, and Q in their original positions.
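
The three simple ciphers named in the ''Gravity Falls'' entry above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and sample strings are invented for the example, "3 letters back" is read as shifting each ciphertext letter back by three, and the letter-to-number cipher is assumed to use A = 1 through Z = 26, which the entry itself does not specify.

```python
import string

ALPHABET = string.ascii_uppercase


def caesar_shift(text, shift):
    """Shift each uppercase letter by `shift` places, wrapping around modulo 26."""
    out = []
    for ch in text:
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            out.append(ch)  # leave spaces and punctuation untouched
    return "".join(out)


def atbash(text):
    """Swap A with Z, B with Y, and so on; the cipher is its own inverse."""
    return text.translate(str.maketrans(ALPHABET, ALPHABET[::-1]))


def numbers_to_letters(numbers):
    """Decode a list of numbers, assuming A = 1, ..., Z = 26."""
    return "".join(ALPHABET[n - 1] for n in numbers)


print(caesar_shift("KHOOR", -3))                # HELLO
print(atbash("SVOOL"))                          # HELLO
print(numbers_to_letters([8, 5, 12, 12, 15]))   # HELLO
```

All three are monoalphabetic: each plaintext letter always maps to the same ciphertext symbol, which is why they fall quickly to frequency analysis or a short crib.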


See also

* Ban (unit) with Centiban Table
* Copiale cipher
* Leet
* Vigenère cipher
* Topics in cryptography
* Musical Substitution Ciphers


References


External links


Monoalphabetic Substitution
Breaking A Monoalphabetic Encryption System Using a Known Plaintext Attack