Alphabetical order

Alphabetical order is a system whereby character strings are placed in order based on the position of the characters in the conventional ordering of an alphabet. It is one of the methods of collation. In mathematics, a lexicographical order is the generalization of the alphabetical order to other data types, such as sequences of digits or numbers. When applied to strings or sequences that may contain digits, numbers or more elaborate types of elements, in addition to alphabetical characters, the alphabetical order is generally called a lexicographical order.

To determine which of two strings of characters comes first when arranging in alphabetical order, their first letters are compared. If they differ, then the string whose first letter comes earlier in the alphabet comes before the other string. If the first letters are the same, then the second letters are compared, and so on. If a position is reached where one string has no more letters to compare while the other does, then the first (shorter) string is deemed to come first in alphabetical order.

Capital letters (upper case) are generally considered to be identical to their corresponding lower case letters for the purposes of alphabetical ordering, although conventions may be adopted to handle situations where two strings differ ''only'' in capitalization. Various conventions also exist for the handling of strings containing spaces, modified letters (such as those with diacritics), and non-letter characters such as marks of punctuation.

The result of placing a set of words or strings in alphabetical order is that all of the strings beginning with the same letter are grouped together; within that grouping all words beginning with the same two-letter sequence are grouped together; and so on. The system thus tends to maximize the number of common initial letters between adjacent words.
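The comparison rule described above can be sketched in a few lines of Python. This is a minimal illustration, not a full collation algorithm: the function name `alphabetical_compare` is hypothetical, case is ignored as the text describes, and only the unaccented letters a to z are assumed to appear.

```python
from functools import cmp_to_key


def alphabetical_compare(a: str, b: str) -> int:
    """Return -1, 0 or 1 as `a` sorts before, equal to, or after `b`."""
    # Upper and lower case are treated as identical.
    a_folded, b_folded = a.casefold(), b.casefold()
    # Compare position by position until the first difference.
    for ch_a, ch_b in zip(a_folded, b_folded):
        if ch_a != ch_b:
            # Assumption: only basic Latin letters, so code-point
            # order coincides with alphabet order.
            return -1 if ch_a < ch_b else 1
    # Every shared position matched: the shorter string comes first.
    if len(a_folded) == len(b_folded):
        return 0
    return -1 if len(a_folded) < len(b_folded) else 1


words = ["Astronomy", "As", "Baa", "Aster", "At"]
ordered = sorted(words, key=cmp_to_key(alphabetical_compare))
print(ordered)  # ['As', 'Aster', 'Astronomy', 'At', 'Baa']
```

In practice the same result is usually obtained with `sorted(words, key=str.casefold)`; the explicit loop is shown only to mirror the step-by-step description.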

# History

Alphabetical order was first used in the 1st millennium BCE by Northwest Semitic scribes using the abjad system. However, a range of other methods of classifying and ordering material, including geographical, chronological, and hierarchical arrangements, were preferred over alphabetical order for centuries.

The Bible is dated to the 6th–7th centuries BCE. In the Book of Jeremiah, the prophet utilizes an Atbash substitution cipher, based on alphabetical order. Similarly, biblical authors used acrostics based on the (ordered) Hebrew alphabet.

The first effective use of alphabetical order as a cataloging device among scholars may have been in ancient Alexandria, in the Great Library of Alexandria, which was founded around 300 BCE. The poet and scholar Callimachus, who worked there, is thought to have created the world's first library catalog, known as the ''Pinakes'', with scrolls shelved in alphabetical order of the first letter of authors' names.

In the 1st century BCE, the Roman writer Varro compiled alphabetic lists of authors and titles. In the 2nd century CE, Sextus Pompeius Festus wrote an encyclopedic epitome of the works of Verrius Flaccus, ''De verborum significatu'', with entries in alphabetic order. In the 3rd century CE, Harpocration wrote a Homeric lexicon alphabetized by all letters. In the 10th century, the author of the ''Suda'' used alphabetic order with phonetic variations.

Alphabetical order as an aid to consultation started to enter the mainstream of Western European intellectual life in the second half of the 12th century, when alphabetical tools were developed to help preachers analyse biblical vocabulary. This led to the compilation of alphabetical concordances of the Bible by the Dominican friars in Paris in the 13th century, under Hugh of Saint Cher. Older reference works such as St. Jerome's ''Interpretations of Hebrew Names'' were alphabetized for ease of consultation. The use of alphabetical order was initially resisted by scholars, who expected their students to master their area of study according to its own rational structures; its success was driven by such tools as Robert Kilwardby's index to the works of St. Augustine, which helped readers access the full original text instead of depending on the compilations of excerpts which had become prominent in 12th-century scholasticism. The adoption of alphabetical order was part of the transition from the primacy of memory to that of written works.

The idea of ordering information by the order of the alphabet also met resistance from the compilers of encyclopaedias in the 12th and 13th centuries, who were all devout churchmen. They preferred to organise their material theologically, in the order of God's creation, starting with ''Deus'' (meaning God).

In 1604 Robert Cawdrey had to explain in ''Table Alphabeticall'', the first monolingual English dictionary, "Nowe if the word, which thou art desirous to finde, begin with (a) then looke in the beginning of this Table, but if with (v) looke towards the end". Although as late as 1803 Samuel Taylor Coleridge condemned encyclopedias with "an arrangement determined by the accident of initial letters", many lists are today based on this principle.

Arrangement in alphabetical order can be seen as a force for democratising access to information, as it does not require extensive prior knowledge to find what is needed.

# Ordering in the Latin script

## Basic order and examples

The standard order of the modern ISO basic Latin alphabet is:

A-B-C-D-E-F-G-H-I-J-K-L-M-N-O-P-Q-R-S-T-U-V-W-X-Y-Z

An example of straightforward alphabetical ordering follows:

* ''As; Aster; Astrolabe; Astronomy; Astrophysics; At; Ataman; Attack; Baa''

Another example:

* ''Barnacle; Be; Been; Benefit; Bent''

The above words are ordered alphabetically. ''As'' comes before ''Aster'' because they begin with the same two letters and ''As'' has no more letters after that whereas ''Aster'' does. The next three words come after ''Aster'' because their fourth letter (the first one that differs) is ''r'', which comes after ''e'' (the fourth letter of ''Aster'') in the alphabet. Those words themselves are ordered based on their sixth letters (''l'', ''n'' and ''p'' respectively). Then comes ''At'', which differs from the preceding words in the second letter (''t'' comes after ''s''). ''Ataman'' comes after ''At'' for the same reason that ''Aster'' came after ''As''. ''Attack'' follows ''Ataman'' based on comparison of their third letters, and ''Baa'' comes after all of the others because it has a different first letter.

## Treatment of multiword strings

When some of the strings being ordered consist of more than one word, i.e., they contain spaces or other separators such as hyphens, then two basic approaches may be taken.

In the first approach, all strings are ordered initially according to their first word, as in the sequence:

* ''Oak; Oak Hill; Oak Ridge; Oakley Park; Oakley River''

where all strings beginning with the separate word ''Oak'' precede all those beginning ''Oakley'', because ''Oak'' precedes ''Oakley'' in alphabetical order.

In the second approach, strings are alphabetized as if they had no spaces, giving the sequence:

* ''Oak; Oak Hill; Oakley Park; Oakley River; Oak Ridge''

where ''Oak Ridge'' now comes after the ''Oakley'' strings, as it would if it were written "Oakridge".

The second approach is the one usually taken in dictionaries, and it is thus often called ''dictionary order'' by publishers. The first approach has often been used in book indexes, although each publisher traditionally set its own standards for which approach to use therein; there was no ISO standard for book indexes (ISO 999) before 1975.
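The two approaches differ only in the sort key that is derived from each string. The following Python sketch makes that concrete; the helper names `word_by_word_key` and `letter_by_letter_key` are illustrative, and for simplicity only whitespace is treated as a word separator in the first key.

```python
names = ["Oakley River", "Oak Ridge", "Oak", "Oakley Park", "Oak Hill"]


def word_by_word_key(s: str) -> list[str]:
    # First approach: compare the first words, then the second words, etc.
    return [word.casefold() for word in s.split()]


def letter_by_letter_key(s: str) -> str:
    # Second approach: ignore spaces and other separators entirely.
    return "".join(ch for ch in s.casefold() if ch.isalnum())


by_word = sorted(names, key=word_by_word_key)
by_letter = sorted(names, key=letter_by_letter_key)

print(by_word)
# ['Oak', 'Oak Hill', 'Oak Ridge', 'Oakley Park', 'Oakley River']
print(by_letter)
# ['Oak', 'Oak Hill', 'Oakley Park', 'Oakley River', 'Oak Ridge']
```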

## Special cases

### Modified letters

In French, modified letters (such as those with diacritics) are treated the same as the base letter for alphabetical ordering purposes. For example, ''rôle'' comes between ''rock'' and ''rose'', as if it were written ''role''. However, languages that use such letters systematically generally have their own ordering rules. See Language-specific conventions below.
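One way to approximate "treat a modified letter as its base letter" in code is to decompose each character and discard the combining marks. The sketch below uses Python's standard `unicodedata` module; the helper names are hypothetical, and the approach deliberately ignores the language-specific rules mentioned above.

```python
import unicodedata


def strip_diacritics(s: str) -> str:
    # NFD splits "ô" into "o" plus a combining circumflex accent.
    decomposed = unicodedata.normalize("NFD", s)
    # Drop the combining marks, keeping only the base characters.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def base_letter_key(s: str) -> str:
    return strip_diacritics(s).casefold()


words = ["rose", "rôle", "rock"]
print(sorted(words))                       # ['rock', 'rose', 'rôle']
print(sorted(words, key=base_letter_key))  # ['rock', 'rôle', 'rose']
```

Sorting by raw code points places ''rôle'' last because "ô" has a higher code point than any unaccented letter; the key function restores the position described in the text.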

### Ordering by surname

In most cultures where family names are written after given names, it is still desired to sort lists of names (as in telephone directories) by family name first. In this case, names need to be reordered to be sorted properly. For example, Juan Hernandes and Brian O'Leary should be sorted as "Hernandes, Juan" and "O'Leary, Brian" even if they are not written this way. Capturing this rule in a computer collation algorithm is difficult, and simple attempts will necessarily fail. For example, unless the algorithm has at its disposal an extensive list of family names, there is no way to decide if "Gillian Lucille van der Waal" is "van der Waal, Gillian Lucille", "Waal, Gillian Lucille van der", or even "Lucille van der Waal, Gillian".

Ordering by surname is frequently encountered in academic contexts. Within a single multi-author paper, ordering the authors alphabetically by surname, rather than by other methods such as reverse seniority or subjective degree of contribution to the paper, is seen as a way of "acknowledg[ing] similar contributions" or "avoid[ing] disharmony in collaborating groups". The practice in certain fields of ordering citations in bibliographies by the surnames of their authors has been found to create bias in favour of authors with surnames which appear earlier in the alphabet, while this effect does not appear in fields in which bibliographies are ordered chronologically.
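The difficulty can be seen with a deliberately naive heuristic that takes the last whitespace-separated word as the family name. The function `naive_surname_key` below is a hypothetical illustration of such a "simple attempt", not a recommended method.

```python
def naive_surname_key(full_name: str) -> tuple[str, str]:
    # Assumption (often wrong): the last word is the whole family name.
    *given, surname = full_name.split()
    return (surname.casefold(), " ".join(given).casefold())


names = ["Brian O'Leary", "Juan Hernandes", "Gillian Lucille van der Waal"]
print(sorted(names, key=naive_surname_key))
# ['Juan Hernandes', "Brian O'Leary", 'Gillian Lucille van der Waal']

# The heuristic silently picks one of the three possible readings:
print(naive_surname_key("Gillian Lucille van der Waal"))
# ('waal', 'gillian lucille van der')
```

The heuristic files the third name under "Waal", which is only one of the plausible interpretations listed above. Where it matters, a more robust design is to store the family name and given names as separate fields and sort on those directly.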

### ''The'' and other common words

If a phrase begins with a very common word (such as "the", "a" or "an", called articles in grammar), that word is sometimes ignored or moved to the end of the phrase, but this is not always the case. For example, the book "The Shining" might be treated as "Shining" or "Shining, The", and therefore placed before the title "Summer of Sam", although it may also be treated as simply "The Shining" and placed after "Summer of Sam". Similarly, "A Wrinkle in Time" might be treated as "Wrinkle in Time", "Wrinkle in Time, A", or "A Wrinkle in Time". All three alphabetization methods are fairly easy to create by algorithm, but many programs rely instead on simple lexicographic ordering.
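The three treatments can be sketched as follows. The article list and the helper names `without_article` and `article_last` are assumptions made for this example; a real catalogue would need a per-language list of articles.

```python
ARTICLES = ("the", "a", "an")  # English only; an assumption of this sketch


def without_article(title: str) -> str:
    # "The Shining" -> "Shining"
    first, _, rest = title.partition(" ")
    if rest and first.casefold() in ARTICLES:
        return rest
    return title


def article_last(title: str) -> str:
    # "The Shining" -> "Shining, The"
    first, _, rest = title.partition(" ")
    if rest and first.casefold() in ARTICLES:
        return f"{rest}, {first}"
    return title


titles = ["Summer of Sam", "The Shining", "A Wrinkle in Time"]

# Simple lexicographic ordering keeps the articles in place.
print(sorted(titles, key=str.casefold))
# ['A Wrinkle in Time', 'Summer of Sam', 'The Shining']

# Ignoring the leading article changes the result.
print(sorted(titles, key=lambda t: without_article(t).casefold()))
# ['The Shining', 'Summer of Sam', 'A Wrinkle in Time']

print(article_last("A Wrinkle in Time"))  # Wrinkle in Time, A
```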

### ''Mac'' prefixes

The prefixes ''M′'' and ''Mc'' in Irish and Scottish surnames are abbreviations for ''Mac'', and are sometimes alphabetized as if the spelling were ''Mac'' in full. Thus ''McKinley'' might be listed before ''Mackintosh'' (as it would be if it had been spelled out as "MacKinley"). Since the advent of computer-sorted lists, this type of alphabetization is less frequently encountered, though it is still used in British telephone directories.
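A sort key that expands the abbreviated prefixes before comparing reproduces this convention. The sketch below is a simplification: `mac_key` is a hypothetical helper, and the pattern treats every surname beginning with "Mc" or "M" plus an apostrophe-like mark as a ''Mac'' name, which will not always be correct.

```python
import re

# Matches a leading "Mc" or "M" followed by an apostrophe or prime mark.
_MAC_PREFIX = re.compile(r"^(?:mc|m['′’])", re.IGNORECASE)


def mac_key(surname: str) -> str:
    # Expand the abbreviated prefix to "mac", then ignore case.
    return _MAC_PREFIX.sub("mac", surname).casefold()


surnames = ["Mackintosh", "McKinley"]
print(sorted(surnames, key=str.casefold))  # ['Mackintosh', 'McKinley']
print(sorted(surnames, key=mac_key))       # ['McKinley', 'Mackintosh']
```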

### Ligatures

Ligatures (two or more letters merged into one symbol) which are not considered distinct letters, such as Æ and Œ in English, are typically collated as if the letters were separate: "æther" and "aether" would be ordered the same relative to all other words. This is true even when the ligature is not purely stylistic, such as in loanwords and brand names. Special rules may need to be adopted to sort strings which vary only by whether two letters are joined by a ligature.
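A minimal sketch of this rule expands each ligature into its component letters for comparison and falls back on the original spelling to break ties. The tie-breaking choice (plain spelling before the ligature spelling) is an arbitrary assumption of this example, as is the helper name `ligature_key`.

```python
# Expansion table for the two ligatures mentioned in the text.
LIGATURES = str.maketrans({"æ": "ae", "Æ": "AE", "œ": "oe", "Œ": "OE"})


def ligature_key(s: str) -> tuple[str, str]:
    expanded = s.translate(LIGATURES).casefold()
    # Primary key: the expanded form. Secondary key: the original string,
    # so that "aether" and "æther" still get a stable relative order.
    return (expanded, s)


words = ["after", "æther", "aether", "aegis"]
print(sorted(words, key=ligature_key))
# ['aegis', 'aether', 'æther', 'after']
```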

## Treatment of numerals

When some of the strings contain numerals (or other non-letter characters), various approaches are possible. Sometimes such characters are treated as if they came before or after all the letters of the alphabet. Another method is for numbers to be sorted alphabetically as they would be spelled: for example, ''1776'' would be sorted as if spelled out "seventeen seventy-six", and ''24 heures du Mans'' as if spelled "vingt-quatre..." (French for "twenty-four"). When numerals or other symbols are used as special graphical forms of letters, as ''1337'' for leet or the movie ''Seven'' (which was stylised as ''Se7en''), they may be sorted as if they were those letters. Natural sort order orders strings alphabetically, except that multi-digit numbers are treated as a single character and ordered by the value of the number encoded by the digits.
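Natural sort order can be sketched by splitting each string into runs of digits and runs of other characters, then comparing digit runs by numeric value. The helper `natural_key` is hypothetical, and placing numbers before letters at the same position is one of the possible conventions mentioned above, chosen here for simplicity.

```python
import re


def natural_key(s: str) -> list[tuple[int, int, str]]:
    # Splitting on a capturing group yields text at even indices and
    # digit runs at odd indices.
    parts = re.split(r"([0-9]+)", s.casefold())
    key = []
    for index, part in enumerate(parts):
        if not part:
            continue
        if index % 2 == 1:
            key.append((0, int(part), ""))  # digit run: compare by value
        else:
            key.append((1, 0, part))        # text run: compare as letters
    return key


files = ["file10", "file2", "file1"]
print(sorted(files))                   # ['file1', 'file10', 'file2']
print(sorted(files, key=natural_key))  # ['file1', 'file2', 'file10']
```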

# Automation

Collation algorithms (in combination with sorting algorithms) are used in computer programming to place strings in alphabetical order. A standard example is the Unicode Collation Algorithm, which can be used to put strings containing any Unicode symbols into (an extension of) alphabetical order. It can be made to conform to most language-specific conventions by tailoring its default collation table. Several such tailorings are collected in the Common Locale Data Repository.

# Similar orderings

The principle behind alphabetical ordering can still be applied in languages that do not, strictly speaking, use an alphabet (for example, they may be written using a syllabary or abugida), provided the symbols used have an established ordering. For logographic writing systems, such as Chinese hanzi or Japanese kanji, the method of radical-and-stroke sorting is frequently used as a way of defining an ordering on the symbols. Japanese sometimes uses pronunciation order, most commonly with the Gojūon order but sometimes with the older Iroha ordering.

In mathematics, lexicographical order is a means of ordering sequences in a manner analogous to that used to produce alphabetical order.

Some computer applications use a version of alphabetical order that can be achieved using a very simple algorithm, based purely on the ASCII or Unicode codes for characters. This may have non-standard effects such as placing all capital letters before lower-case ones. See ASCIIbetical order.

A rhyming dictionary is based on sorting words in alphabetical order starting from the last to the first letter of the word.
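Two of these orderings are easy to demonstrate with Python's built-in `sorted`, which compares strings by code point unless given a key. The word lists are illustrative only.

```python
words = ["banana", "Apple", "apple", "Banana"]

# Code-point ("ASCIIbetical") order: every capital precedes every
# lower-case letter.
print(sorted(words))
# ['Apple', 'Banana', 'apple', 'banana']

# Ignoring case gives the conventional alphabetical grouping.
print(sorted(words, key=str.casefold))
# ['Apple', 'apple', 'banana', 'Banana']

# Rhyming-dictionary order: compare from the last letter to the first.
rhymes = sorted(["nation", "cat", "station", "bat"], key=lambda w: w[::-1])
print(rhymes)
# ['nation', 'station', 'bat', 'cat']
```

In the case-insensitive result, strings that differ only in capitalization keep their input order because `sorted` is stable; a stricter convention would need an explicit tie-breaker.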
