HOME

TheInfoList



OR:

COBOL (; an
acronym An acronym is a type of abbreviation consisting of a phrase whose only pronounced elements are the initial letters or initial sounds of words inside that phrase. Acronyms are often spelled with the initial Letter (alphabet), letter of each wor ...
for "common business-oriented language") is a compiled English-like computer programming language designed for business use. It is an imperative, procedural, and, since 2002, object-oriented language. COBOL is primarily used in business, finance, and administrative systems for companies and governments. COBOL is still widely used in applications deployed on
mainframe computer A mainframe computer, informally called a mainframe or big iron, is a computer used primarily by large organizations for critical applications like bulk data processing for tasks such as censuses, industry and consumer statistics, enterprise ...
s, such as large-scale batch and
transaction processing In computer science, transaction processing is information processing that is divided into individual, indivisible operations called ''transactions''. Each transaction must succeed or fail as a complete unit; it can never be only partially c ...
jobs. Many large financial institutions were developing new systems in the language as late as 2006, but most programming in COBOL today is purely to maintain existing applications. Programs are being moved to new platforms, rewritten in modern languages, or replaced with other software. COBOL was designed in 1959 by CODASYL and was partly based on the programming language
FLOW-MATIC FLOW-MATIC, originally known as B-0 (Business Language version 0), was the first English-like data processing language. It was developed for the UNIVAC I at Remington Rand Remington Rand, Inc. was an early American business machine manufactu ...
, designed by Grace Hopper. It was created as part of a U.S. Department of Defense effort to create a portable programming language for data processing. It was originally seen as a stopgap, but the Defense Department promptly pressured computer manufacturers to provide it, resulting in its widespread adoption. It was
standardized Standardization (American English) or standardisation (British English) is the process of implementing and developing technical standards based on the consensus of different parties that include firms, users, interest groups, standards organiza ...
in 1968 and has been revised five times. Expansions include support for
structured Structuring, also known as smurfing in banking jargon, is the practice of executing financial transactions such as making bank deposits in a specific pattern, calculated to avoid triggering financial institutions to file reports required by law ...
and
object-oriented programming Object-oriented programming (OOP) is a programming paradigm based on the concept of '' objects''. Objects can contain data (called fields, attributes or properties) and have actions they can perform (called procedures or methods and impl ...
. The current standard is
ISO The International Organization for Standardization (ISO ; ; ) is an independent, non-governmental, international standard development organization composed of representatives from the national standards organizations of member countries. Me ...
/ IEC 1989:2023. COBOL statements have
prose Prose is language that follows the natural flow or rhythm of speech, ordinary grammatical structures, or, in writing, typical conventions and formatting. Thus, prose ranges from informal speaking to formal academic writing. Prose differs most n ...
syntax such as , which was designed to be self-documenting and highly readable. However, it is verbose and uses over 300
reserved word In a programming language, a reserved word (sometimes known as a reserved identifier) is a word that cannot be used by a programmer as an identifier, such as the name of a variable, function, or label – it is "reserved from use". In brief, an '' ...
s compared to the succinct and mathematically inspired syntax of other languages. The COBOL code is split into four ''divisions'' (identification, environment, data, and procedure), containing a rigid hierarchy of sections, paragraphs, and sentences. Lacking a large
standard library In computer programming, a standard library is the library (computing), library made available across Programming language implementation, implementations of a programming language. Often, a standard library is specified by its associated program ...
, the standard specifies 43 statements, 87 functions, and just one class. COBOL has been criticized for its verbosity, design process, and poor support for
structured programming Structured programming is a programming paradigm aimed at improving the clarity, quality, and development time of a computer program by making specific disciplined use of the structured control flow constructs of selection ( if/then/else) and repet ...
. These weaknesses often result in monolithic programs that are hard to comprehend as a whole, despite their local readability. For years, COBOL has been assumed as a programming language for business operations in mainframes, although in recent years, many COBOL operations have been moved to
cloud computing Cloud computing is "a paradigm for enabling network access to a scalable and elastic pool of shareable physical or virtual resources with self-service provisioning and administration on-demand," according to International Organization for ...
.


History and specification


Background

In the late 1950s, computer users and manufacturers were becoming concerned about the rising cost of programming. A 1959 survey had found that in any data processing installation, the programming cost US$800,000 on average and that translating programs to run on new hardware would cost US$600,000. At a time when new programming languages were proliferating, the same survey suggested that if a common business-oriented language were used, conversion would be far cheaper and faster. On 8 April 1959, Mary K. Hawes, a computer scientist at
Burroughs Corporation The Burroughs Corporation was a major American manufacturer of business equipment. The company was founded in 1886 as the American Arithmometer Company by William Seward Burroughs I, William Seward Burroughs. The company's history paralleled many ...
, called a meeting of representatives from academia, computer users, and manufacturers at the
University of Pennsylvania The University of Pennsylvania (Penn or UPenn) is a Private university, private Ivy League research university in Philadelphia, Pennsylvania, United States. One of nine colonial colleges, it was chartered in 1755 through the efforts of f ...
to organize a formal meeting on common business languages. Representatives included Grace Hopper (inventor of the English-like data processing language
FLOW-MATIC FLOW-MATIC, originally known as B-0 (Business Language version 0), was the first English-like data processing language. It was developed for the UNIVAC I at Remington Rand Remington Rand, Inc. was an early American business machine manufactu ...
), Jean Sammet, and Saul Gorn. At the April meeting, the group asked the Department of Defense (DoD) to sponsor an effort to create a common business language. The delegation impressed Charles A. Phillips, director of the Data System Research Staff at the DoD, who thought that they "thoroughly understood" the DoD's problems. The DoD operated 225 computers, had 175 more on order, and had spent over $200 million on implementing programs to run on them. Portable programs would save time, reduce costs, and ease modernization. Charles Phillips agreed to sponsor the meeting, and tasked the delegation with drafting the agenda.


COBOL 60

On 28 and 29 May 1959, a meeting was held at
the Pentagon The Pentagon is the headquarters building of the United States Department of Defense, in Arlington County, Virginia, across the Potomac River from Washington, D.C. The building was constructed on an accelerated schedule during World War II. As ...
to discuss the creation of a common programming language for business (exactly one year after the Zürich ALGOL 58 meeting). It was attended by 41 people and was chaired by Phillips. The Department of Defense was concerned about whether it could run the same data processing programs on different computers. FORTRAN, the only mainstream language at the time, lacked the features needed to write such programs. Representatives enthusiastically described a language that could work in a wide variety of environments, from banking and insurance to utilities and inventory control. They agreed unanimously that more people should be able to program, and that the new language should not be restricted by the limitations of contemporary technology. A majority agreed that the language should make maximal use of English, be capable of change, be machine-independent, and be easy to use, even at the expense of power. The meeting resulted in the creation of a
steering committee A committee or commission is a body of one or more persons subordinate to a deliberative assembly or other form of organization. A committee may not itself be considered to be a form of assembly or a decision-making body. Usually, an assembly o ...
and short, intermediate, and long-range committees. The short-range committee was given until September (three months) to produce specifications for an interim language, which would then be improved upon by the other committees. Their official mission, however, was to identify the strengths and weaknesses of existing programming languages; it did not explicitly direct them to create a new language. The deadline was met with disbelief by the short-range committee. One member, Betty Holberton, described the three-month deadline as "gross optimism" and doubted that the language really would be a stopgap. The steering committee met on 4 June and agreed to name the entire activity the ''Committee on Data Systems Languages'', or CODASYL, and to form an executive committee. The short-range committee members represented six computer manufacturers and three government agencies. The computer manufacturers were
Burroughs Corporation The Burroughs Corporation was a major American manufacturer of business equipment. The company was founded in 1886 as the American Arithmometer Company by William Seward Burroughs I, William Seward Burroughs. The company's history paralleled many ...
,
IBM International Business Machines Corporation (using the trademark IBM), nicknamed Big Blue, is an American Multinational corporation, multinational technology company headquartered in Armonk, New York, and present in over 175 countries. It is ...
, Minneapolis-Honeywell (Honeywell Labs), RCA,
Sperry Rand Sperry Corporation was a major American equipment and electronics company whose existence spanned more than seven decades of the 20th century. Sperry ceased to exist in 1986 following a prolonged hostile takeover bid engineered by Burroughs ...
, and Sylvania Electric Products. The government agencies were the U.S. Air Force, the Navy's David Taylor Model Basin, and the National Bureau of Standards (now the National Institute of Standards and Technology). The committee was chaired by Joseph Wegstein of the U.S. National Bureau of Standards. Work began by investigating data descriptions, statements, existing applications, and user experiences. The committee mainly examined the
FLOW-MATIC FLOW-MATIC, originally known as B-0 (Business Language version 0), was the first English-like data processing language. It was developed for the UNIVAC I at Remington Rand Remington Rand, Inc. was an early American business machine manufactu ...
, AIMACO, and COMTRAN programming languages. The FLOW-MATIC language was particularly influential because it had been implemented and because AIMACO was a derivative of it with only minor changes. FLOW-MATIC's inventor, Grace Hopper, also served as a technical adviser to the committee. FLOW-MATIC's major contributions to COBOL were long variable names, English words for commands, and the separation of data descriptions and instructions. Hopper is sometimes called "the mother of COBOL" or "the grandmother of COBOL", although Jean Sammet, a lead designer of COBOL, said Hopper "was not the mother, creator, or developer of Cobol." IBM's COMTRAN language, invented by Bob Bemer, was regarded as a competitor to FLOW-MATIC by a short-range committee made up of colleagues of Grace Hopper. Some of its features were not incorporated into COBOL so that it would not look like IBM had dominated the design process, and Jean Sammet said in 1981 that there had been a "strong anti-IBM bias" from some committee members (herself included). In one case, after Roy Goldfinger, author of the COMTRAN manual and intermediate-range committee member, attended a subcommittee meeting to support his language and encourage the use of algebraic expressions, Grace Hopper sent a memo to the short-range committee reiterating Sperry Rand's efforts to create a language based on English. In 1980, Grace Hopper commented that "COBOL 60 is 95% FLOW-MATIC" and that COMTRAN had had an "extremely small" influence. Furthermore, she said that she would claim that work was influenced by both FLOW-MATIC and COMTRAN only to "keep other people happy o theywouldn't try to knock us out.". Features from COMTRAN incorporated into COBOL included formulas, the clause, an improved IF statement, which obviated the need for GO TOs, and a more robust file management system. The usefulness of the committee's work was a subject of great debate. While some members thought the language had too many compromises and was the result of design by committee, others felt it was better than the three languages examined. Some felt the language was too complex; others, too simple. Controversial features included those some considered useless or too advanced for data processing users. Such features included
Boolean expression In computer science, a Boolean expression (also known as logical expression) is an expression used in programming languages that produces a Boolean value when evaluated. A Boolean value is either true or false. A Boolean expression may be compos ...
s,
formula In science, a formula is a concise way of expressing information symbolically, as in a mathematical formula or a ''chemical formula''. The informal use of the term ''formula'' in science refers to the general construct of a relationship betwe ...
s, and table ' (indices). Another point of controversy was whether to make keywords context-sensitive and the effect that would have on readability. Although context-sensitive keywords were rejected, the approach was later used in
PL/I PL/I (Programming Language One, pronounced and sometimes written PL/1) is a procedural, imperative computer programming language initially developed by IBM. It is designed for scientific, engineering, business and system programming. It has b ...
and partially in COBOL from 2002. Little consideration was given to
interactivity Across the many fields concerned with interactivity, including information science, computer science, human-computer interaction, communication, and industrial design, there is little agreement over the meaning of the term "interactivity", but ...
, interaction with
operating system An operating system (OS) is system software that manages computer hardware and software resources, and provides common daemon (computing), services for computer programs. Time-sharing operating systems scheduler (computing), schedule tasks for ...
s (few existed at that time), and functions (thought of as purely mathematical and of no use in data processing). The specifications were presented to the executive committee on 4 September. They fell short of expectations: Joseph Wegstein noted that "it contains rough spots and requires some additions," and Bob Bemer later described them as a "hodgepodge." The committee was given until December to improve it. At a mid-September meeting, the committee discussed the new language's name. Suggestions included "BUSY" (Business System), "INFOSYL" (Information System Language), and "COCOSYL" (Common Computer Systems Language). It is unclear who coined the name "COBOL", although Bob Bemer later claimed it had been his suggestion. In October, the intermediate-range committee received copies of the
FACT A fact is a truth, true data, datum about one or more aspects of a circumstance. Standard reference works are often used to Fact-checking, check facts. Science, Scientific facts are verified by repeatable careful observation or measurement by ...
language specification created by Roy Nutt. Its features impressed the committee so much that they passed a resolution to base COBOL on it. This was a blow to the short-range committee, who had made good progress on the specification. Despite being technically superior, FACT had not been created with portability in mind or through manufacturer and user consensus. It also lacked a demonstrable implementation, allowing supporters of a FLOW-MATIC-based COBOL to overturn the resolution. RCA representative Howard Bromberg also blocked FACT, so that RCA's work on a COBOL implementation would not go to waste. It soon became apparent that the committee was too large to make any further progress quickly. A frustrated Howard Bromberg bought a $15 tombstone with "COBOL" engraved on it and sent it to Charles Phillips to demonstrate his displeasure. A subcommittee was formed to analyze existing languages and was made up of six individuals: * William Selden and Gertrude Tierney of IBM, * Howard Bromberg and Howard Discount of RCA, * Vernon Reeves and Jean E. Sammet of Sylvania Electric Products. The subcommittee did most of the work creating the specification, leaving the short-range committee to review and modify their work before producing the finished specification. The specifications were approved by the executive committee on 8 January 1960, and sent to the government printing office, which printed them as ''COBOL 60''. The language's stated objectives were to allow efficient, portable programs to be easily written, to allow users to move to new systems with minimal effort and cost, and to be suitable for inexperienced programmers. The CODASYL Executive Committee later created the COBOL Maintenance Committee to answer questions from users and vendors and to improve and expand the specifications. During 1960, the list of manufacturers planning to build COBOL compilers grew. By September, five more manufacturers had joined CODASYL ( Bendix,
Control Data Corporation Control Data Corporation (CDC) was a mainframe and supercomputer company that in the 1960s was one of the nine major U.S. computer companies, which group included IBM, the Burroughs Corporation, and the Digital Equipment Corporation (DEC), the N ...
,
General Electric General Electric Company (GE) was an American Multinational corporation, multinational Conglomerate (company), conglomerate founded in 1892, incorporated in the New York (state), state of New York and headquartered in Boston. Over the year ...
(GE), National Cash Register, and
Philco Philco (an acronym for Philadelphia Battery Company) is an American electronics industry, electronics manufacturer headquartered in Philadelphia. Philco was a pioneer in battery, radio, and television production. In 1961, the company was purchase ...
), and all represented manufacturers had announced COBOL compilers. GE and IBM planned to integrate COBOL into their own languages, GECOM and COMTRAN, respectively. In contrast,
International Computers and Tabulators International Computers and Tabulators or ICT was a British computer manufacturer, formed in 1959 by a merger of the British Tabulating Machine Company (BTM) and Powers-Samas. In 1963 it acquired the business computer divisions of Ferranti. It ...
planned to replace their language, CODEL, with COBOL. Meanwhile, RCA and Sperry Rand worked on creating COBOL compilers. The first COBOL program ran on 17 August on an RCA 501. On 6 and 7 December, the same COBOL program (albeit with minor changes) ran on an RCA computer and a Remington-Rand
Univac UNIVAC (Universal Automatic Computer) was a line of electronic digital stored-program computers starting with the products of the Eckert–Mauchly Computer Corporation. Later the name was applied to a division of the Remington Rand company and ...
computer, demonstrating that compatibility could be achieved. The relative influence of the languages that were used is still indicated in the recommended advisory printed in all COBOL reference manuals:


COBOL-61 to COBOL-65

Many logical flaws were found in ''COBOL 60'', leading General Electric's
Charles Katz Charles Abraham Katz (July 7, 1927 – May 9, 1974) was an American mathematician and computer scientist known for his contributions to early compiler development in the 1950s. Katz received two degrees in mathematics, a Bachelor of Science (B.S ...
to warn that it could not be interpreted unambiguously. A reluctant short-term committee performed a total cleanup, and, by March 1963, it was reported that COBOL's syntax was as definable as
ALGOL ALGOL (; short for "Algorithmic Language") is a family of imperative computer programming languages originally developed in 1958. ALGOL heavily influenced many other languages and was the standard method for algorithm description used by the ...
's, although semantic ambiguities remained. Early COBOL compilers were primitive and slow. COBOL is a difficult language to write a compiler for, due to the large syntax and many optional elements within syntactic constructs, as well as the need to generate efficient code for a language with many possible data representations, implicit type conversions, and necessary set-ups for I/O operations. A 1962 US Navy evaluation found compilation speeds of 3–11 statements per minute. By mid-1964, they had increased to 11–1000 statements per minute. It was observed that increasing memory would drastically increase speed and that compilation costs varied wildly: costs per statement were between $0.23 and $18.91. In late 1962, IBM announced that COBOL would be their primary development language and that development of COMTRAN would cease. The COBOL specification was revised three times in the five years after its publication. COBOL-60 was replaced in 1961 by COBOL-61. This was then replaced by the COBOL-61 Extended specifications in 1963, which introduced the sort and report writer facilities. The added facilities corrected flaws identified by Honeywell in late 1959 in a letter to the short-range committee. COBOL Edition 1965 brought further clarifications to the specifications and introduced facilities for handling
mass storage In computing, mass storage refers to the storage of large amounts of data in a persisting and machine-readable fashion. In general, the term ''mass'' in ''mass storage'' is used to mean ''large'' in relation to contemporaneous hard disk drive ...
files and tables.


COBOL-68

Efforts began to standardize COBOL to overcome incompatibilities between versions. In late 1962, both ISO and the United States of America Standards Institute (now
ANSI The American National Standards Institute (ANSI ) is a private nonprofit organization that oversees the development of voluntary consensus standards for products, services, processes, systems, and personnel in the United States. The organiz ...
) formed groups to create standards. ANSI produced ''USA Standard COBOL X3.23'' in August 1968, which became the cornerstone for later versions. This version was known as American National Standard (ANS) COBOL and was adopted by ISO in 1972.


COBOL-74

By 1970, COBOL had become the most widely used programming language in the world. Independently of the ANSI committee, the CODASYL Programming Language Committee was working on improving the language. They described new versions in 1968, 1969, 1970, and 1973, including changes such as new inter-program communication, debugging, and file merging facilities, as well as improved string handling and
library A library is a collection of Book, books, and possibly other Document, materials and Media (communication), media, that is accessible for use by its members and members of allied institutions. Libraries provide physical (hard copies) or electron ...
inclusion features. Although CODASYL was independent of the ANSI committee, the ''CODASYL Journal of Development'' was used by ANSI to identify features that were popular enough to warrant implementing. The Programming Language Committee also liaised with ECMA and the Japanese COBOL Standard committee. The Programming Language Committee was not well-known, however. The vice president, William Rinehuls, complained that two-thirds of the COBOL community did not know of the committee's existence. It also lacked the funds to make public documents, such as minutes of meetings and change proposals, freely available. In 1974, ANSI published a revised version of (ANS) COBOL, containing new features such as file organizations, the statement and the segmentation module. Deleted features included the statement, the statement (which was replaced by ), and the implementer-defined random access module (which was superseded by the new sequential and relative I/O modules). These made up 44 changes, which rendered existing statements incompatible with the new standard. The report writer was slated to be removed from COBOL but was reinstated before the standard was published. ISO later adopted the updated standard in 1978.


COBOL-85

In June 1978, work began on revising COBOL-74. The proposed standard (commonly called COBOL-80) differed significantly from the previous one, causing concerns about incompatibility and conversion costs. In January 1981, Joseph T. Brophy, Senior Vice-president of Travelers Insurance, threatened to sue the standard committee because it was not upwards compatible with COBOL-74. Mr. Brophy described previous conversions of their 40-million-line code base as "non-productive" and a "complete waste of our programmer resources". Later that year, the Data Processing Management Association (DPMA) said it was "strongly opposed" to the new standard, citing "prohibitive" conversion costs and enhancements that were "forced on the user". During the first public review period, the committee received 2,200 responses, of which 1,700 were negative form letters. Other responses were detailed analyses of the effect COBOL-80 would have on their systems; conversion costs were predicted to be at least 50 cents per line of code. Fewer than a dozen of the responses were in favor of the proposed standard. ISO TC97-SC5 installed in 1979 the international COBOL Experts Group, on initiative of Wim Ebbinkhuijsen. The group consisted of COBOL experts from many countries, including the United States. Its goal was to achieve mutual understanding and respect between ANSI and the rest of the world with regard to the need of new COBOL features. After three years, ISO changed the status of the group to a formal Working Group: WG 4 COBOL. The group took primary ownership and development of the COBOL standard, where ANSI made most of the proposals. In 1983, the DPMA withdrew its opposition to the standard, citing the responsiveness of the committee to public concerns. In the same year, a National Bureau of Standards study concluded that the proposed standard would present few problems. A year later, DEC released a VAX/VMS COBOL-80, and noted that conversion of COBOL-74 programs posed few problems. The new EVALUATE statement and inline PERFORM were particularly well received and improved productivity, thanks to simplified
control flow In computer science, control flow (or flow of control) is the order in which individual statements, instructions or function calls of an imperative program are executed or evaluated. The emphasis on explicit control flow distinguishes an '' ...
and
debugging In engineering, debugging is the process of finding the Root cause analysis, root cause, workarounds, and possible fixes for bug (engineering), bugs. For software, debugging tactics can involve interactive debugging, control flow analysis, Logf ...
. The second public review drew another 1,000 (mainly negative) responses, while the last drew just 25, by which time many concerns had been addressed. In 1985, the ISO Working Group 4 accepted the then-version of the ANSI proposed standard, made several changes and set it as the new ISO standard COBOL 85. It was published in late 1985. Sixty features were changed or deprecated and 115 were added, such as: * Scope terminators (END-IF, END-PERFORM, END-READ, etc.) * Nested subprograms * CONTINUE, a no-operation statement * EVALUATE, a
switch statement In computer programming languages, a switch statement is a type of selection control mechanism used to allow the value of a variable or expression to change the control flow of program execution via search and map. Switch statements function ...
* INITIALIZE, a statement that can set groups of data to their default values * Inline PERFORM loop bodies – previously, loop bodies had to be specified in a separate procedure * Reference modification, which allows access to substrings * I/O status codes. The new standard was adopted by all national standard bodies, including ANSI. Two amendments followed in 1989 and 1993. The first amendment introduced intrinsic functions and the other provided corrections.


COBOL 2002 and object-oriented COBOL

In 1997, Gartner Group estimated that there were a total of 200 billion lines of COBOL in existence, which ran 80% of all business programs. In the early 1990s, work began on adding
object-oriented programming Object-oriented programming (OOP) is a programming paradigm based on the concept of '' objects''. Objects can contain data (called fields, attributes or properties) and have actions they can perform (called procedures or methods and impl ...
in the next full revision of COBOL. Object-oriented features were taken from C++ and
Smalltalk Smalltalk is a purely object oriented programming language (OOP) that was originally created in the 1970s for educational use, specifically for constructionist learning, but later found use in business. It was created at Xerox PARC by Learni ...
. The initial estimate was to have this revision completed by 1997, and an ISO Committee Draft (CD) was available by 1997. Some vendors (including Micro Focus, Fujitsu, and
IBM International Business Machines Corporation (using the trademark IBM), nicknamed Big Blue, is an American Multinational corporation, multinational technology company headquartered in Armonk, New York, and present in over 175 countries. It is ...
) introduced object-oriented syntax based on drafts of the full revision. The final approved ISO standard was approved and published in late 2002. Fujitsu/GTSoftware, Micro Focus introduced object-oriented COBOL compilers targeting the .NET Framework. There were many other new features, many of which had been in the ''CODASYL COBOL Journal of Development'' since 1978 and had missed the opportunity to be included in COBOL-85. These other features included: * Free-form code * User-defined functions *
Recursion Recursion occurs when the definition of a concept or process depends on a simpler or previous version of itself. Recursion is used in a variety of disciplines ranging from linguistics to logic. The most common application of recursion is in m ...
* Locale-based processing * Support for extended character sets such as
Unicode Unicode or ''The Unicode Standard'' or TUS is a character encoding standard maintained by the Unicode Consortium designed to support the use of text in all of the world's writing systems that can be digitized. Version 16.0 defines 154,998 Char ...
*
Floating-point In computing, floating-point arithmetic (FP) is arithmetic on subsets of real numbers formed by a ''significand'' (a Sign (mathematics), signed sequence of a fixed number of digits in some Radix, base) multiplied by an integer power of that ba ...
and binary data types (until then, binary items were truncated based on their declaration's base-10 specification) * Portable arithmetic results * Bit and Boolean data types * Pointers and syntax for getting and freeing storage * The for
text-based user interface In computing, text-based user interfaces (TUI) (alternately terminal user interfaces, to reflect a dependence upon the properties of computer terminals and not just text), is a retronym describing a type of user interface (UI) common as an ear ...
s * The facility * Improved interoperability with other programming languages and framework environments such as .NET and
Java Java is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea (a part of Pacific Ocean) to the north. With a population of 156.9 million people (including Madura) in mid 2024, proje ...
. Three
corrigenda An erratum or corrigendum (: errata, corrigenda) (comes from ) is a correction of a published text. Generally, publishers issue an erratum for a production error (i.e., an error introduced during the publishing process) and a corrigendum for an a ...
were published for the standard: two in 2006 and one in 2009.


COBOL 2014

Between 2003 and 2009, three Technical Reports (TRs) were produced describing object finalization,
XML Extensible Markup Language (XML) is a markup language and file format for storing, transmitting, and reconstructing data. It defines a set of rules for encoding electronic document, documents in a format that is both human-readable and Machine-r ...
processing and collection classes for COBOL. COBOL 2002 suffered from poor support: no compilers completely supported the standard. Micro Focus found that it was due to a lack of user demand for the new features and due to the abolition of the
NIST The National Institute of Standards and Technology (NIST) is an agency of the United States Department of Commerce whose mission is to promote American innovation and industrial competitiveness. NIST's activities are organized into physical s ...
test suite, which had been used to test compiler conformance. The standardization process was also found to be slow and under-resourced. COBOL 2014 includes the following changes: * Major features have been made optional, such as the VALIDATE facility, the report writer and the screen-handling facility * Dynamic capacity tables (a feature dropped from the draft of COBOL 2002) * Portable arithmetic results have been replaced by
IEEE 754 The IEEE Standard for Floating-Point Arithmetic (IEEE 754) is a technical standard for floating-point arithmetic originally established in 1985 by the Institute of Electrical and Electronics Engineers (IEEE). The standard #Design rationale, add ...
data types * Method overloading


COBOL 2023

The COBOL 2023 standard added a few new features: * Asynchronous messaging syntax using the SEND and RECEIVE statements * A
transaction processing In computer science, transaction processing is information processing that is divided into individual, indivisible operations called ''transactions''. Each transaction must succeed or fail as a complete unit; it can never be only partially c ...
facility with COMMIT and ROLLBACK * XOR logical operator * The CONTINUE statement can be extended as to pause the program for a specified duration * A DELETE FILE statement * LINE SEQUENTIAL file organization * Defined infinite looping with PERFORM UNTIL EXIT * SUBSTITUTE intrinsic function allowing for substring substitution of different length * CONVERT function for base-conversion * Boolean shifting operators There is as yet no known complete implementation of this standard.


Legacy

COBOL programs are used globally in governments and various industries including retail, travel, finance, and healthcare. Testimony before the
House of Representatives House of Representatives is the name of legislative bodies in many countries and sub-national entities. In many countries, the House of Representatives is the lower house of a bicameral legislature, with the corresponding upper house often ...
in 2016 indicated that COBOL is still in use by many federal agencies. COBOL currently runs on diverse operating systems such as
z/OS z/OS is a 64-bit operating system for IBM z/Architecture mainframes, introduced by IBM in October 2000. It derives from and is the successor to OS/390, which in turn was preceded by a string of MVS versions.Starting with the earliest: ...
, z/VSE, VME,
Unix Unix (, ; trademarked as UNIX) is a family of multitasking, multi-user computer operating systems that derive from the original AT&T Unix, whose development started in 1969 at the Bell Labs research center by Ken Thompson, Dennis Ritchie, a ...
, NonStop OS,
OpenVMS OpenVMS, often referred to as just VMS, is a multi-user, multiprocessing and virtual memory-based operating system. It is designed to support time-sharing, batch processing, transaction processing and workstation applications. Customers using Op ...
and
Windows Windows is a Product lining, product line of Proprietary software, proprietary graphical user interface, graphical operating systems developed and marketed by Microsoft. It is grouped into families and subfamilies that cater to particular sec ...
. In 1997, the Gartner Group reported that 80% of the world's business ran on COBOL with over 200 billion lines of code and 5 billion lines more being written annually. As of 2020, COBOL ran background processes 95% of the time a credit or debit card was swiped.


Y2K

Near the end of the 20th century, the
year 2000 problem The term year 2000 problem, or simply Y2K, refers to potential computer errors related to the Time formatting and storage bugs, formatting and storage of calendar data for dates in and after the year 2000. Many Computer program, programs repr ...
(Y2K) was the focus of significant COBOL programming effort, sometimes by the same programmers who had designed the systems decades before. The particular level of effort required to correct COBOL code has been attributed to the large amount of business-oriented COBOL, as business applications use dates heavily, and to fixed-length data fields. Some studies attribute as much as "24% of Y2K software repair costs to Cobol". After the clean-up effort put into these programs for Y2K, a 2003 survey found that many remained in use. The authors said that the survey data suggest "a gradual decline in the importance of COBOL in application development over the ollowing10 years unless ... integration with other languages and technologies can be adopted".


Modernization efforts

In 2006 and 2012, ''
Computerworld ''Computerworld'' (abbreviated as CW) is a computer magazine published since 1967 aimed at information technology (IT) and Business computing, business technology professionals. Original a print magazine, ''Computerworld'' published its final pr ...
'' surveys (of 352 readers) found that over 60% of organizations used COBOL (more than C++ and Visual Basic .NET) and that for half of those, COBOL was used for the majority of their internal software. 36% of managers said they planned to migrate from COBOL, and 25% said that they would do so if not for the expense of rewriting legacy code. Alternatively, some businesses have migrated their COBOL programs from mainframes to cheaper, faster hardware. By 2019, the number of COBOL programmers was shrinking fast due to retirements, leading to an impending skills gap in business and government organizations which still use mainframe systems for high-volume transaction processing. Efforts to rewrite COBOL systems in newer languages have proven expensive and problematic, as has the outsourcing of code maintenance, thus proposals to train more people in COBOL are advocated. Several banks have undertaken multi-year COBOL modernization efforts, sometimes resulting in widespread service disruptions that result in fines. During the
COVID-19 pandemic The COVID-19 pandemic (also known as the coronavirus pandemic and COVID pandemic), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), began with an disease outbreak, outbreak of COVID-19 in Wuhan, China, in December ...
and the ensuing surge of unemployment, several US states reported a shortage of skilled COBOL programmers to support the legacy systems used for unemployment benefit management. Many of these systems had been in the process of conversion to more modern programming languages prior to the pandemic, but the process was put on hold. Similarly, the US
Internal Revenue Service The Internal Revenue Service (IRS) is the revenue service for the Federal government of the United States, United States federal government, which is responsible for collecting Taxation in the United States, U.S. federal taxes and administerin ...
rushed to patch its COBOL-based Individual Master File in order to disburse the tens of millions of payments mandated by the Coronavirus Aid, Relief, and Economic Security Act.


Features


Syntax

COBOL has an English-like syntax, which is used to describe nearly everything in COBOL programs. For example, a condition can be expressed as   or more concisely as    or  . More complex conditions can be abbreviated by removing repeated conditions and variables. For example,    can be shortened to . To support this syntax, COBOL has over 300 keywords. Some of the keywords are simple alternative or pluralized spellings of the same word, which provides for more grammatically appropriate statements and clauses; e.g., the and keywords can be used interchangeably, as can and , and and . Each COBOL program is made up of four basic lexical items: words, literals, picture character-strings (see ) and separators. Words include reserved words and user-defined identifiers. They are up to 31 characters long and may include letters, digits, hyphens and underscores. Literals include numerals (e.g. ) and strings (e.g. ). Separators include the space character and commas and semi-colons followed by a space. A COBOL program is split into four divisions: the identification division, the environment division, the data division and the procedure division. The identification division specifies the name and type of the source element and is where classes and interfaces are specified. The environment division specifies any program features that depend on the system running it, such as files and
character sets Character encoding is the process of assigning numbers to graphical character (computing), characters, especially the written characters of human language, allowing them to be stored, transmitted, and transformed using computers. The numerical v ...
. The data division is used to declare variables and
parameter A parameter (), generally, is any characteristic that can help in defining or classifying a particular system (meaning an event, project, object, situation, etc.). That is, a parameter is an element of a system that is useful, or critical, when ...
s. The procedure division contains the program's statements. Each division is sub-divided into sections, which are made up of paragraphs.


Metalanguage

COBOL's syntax is usually described with a unique metalanguage using braces, brackets, bars and underlining. The metalanguage was developed for the original COBOL specifications. As an example, consider the following description of an ADD statement:
\begin \underline\, \begin \text \\ \text \end\dots \;\underline\,\left\\dots \\ em \quad \left \begin \text\,\underline\,\underline\,\text \\ \underline\,\text\,\underline\,\underline\,\text \end\\right\\ em \quad \left ,\underline\,\right \end
This description permits the following variants: ADD 1 TO x ADD 1, a, b TO x ROUNDED, y, z ROUNDED ADD a, b TO c ON SIZE ERROR DISPLAY "Error" END-ADD ADD a TO b NOT SIZE ERROR DISPLAY "No error" ON SIZE ERROR DISPLAY "Error"


Code format

The height of COBOL's popularity coincided with the era of keypunch machines and
punched card A punched card (also punch card or punched-card) is a stiff paper-based medium used to store digital information via the presence or absence of holes in predefined positions. Developed over the 18th to 20th centuries, punched cards were widel ...
s. The program itself was written onto punched cards, then read in and compiled, and the data fed into the program was sometimes on cards as well. COBOL can be written in two formats: fixed (the default) or free. In fixed-format, code must be aligned to fit in certain areas (a hold-over from using punched cards). Until COBOL 2002, these were: In COBOL 2002, Areas A and B were merged to form the program-text area, which now ends at an implementor-defined column. COBOL 2002 also introduced free-format code. Free-format code can be placed in any column of the file, as in newer programming languages. Comments are specified using *>, which can be placed anywhere and can also be used in fixed-format source code. Continuation lines are not present, and the >>PAGE directive replaces the / indicator.


Identification division

The identification division identifies the following code entity and contains the definition of a class or interface.


Object-oriented programming

Classes and interfaces have been in COBOL since 2002. Classes have factory objects, containing class methods and variables, and instance objects, containing instance methods and variables. Inheritance and interfaces provide polymorphism. Support for generic programming is provided through parameterized classes, which can be instantiated to use any class or interface. Objects are stored as references which may be restricted to a certain type. There are two ways of calling a method: the statement, which acts similarly to , or through inline method invocation, which is analogous to using functions. *> These are equivalent. INVOKE my-class "foo" RETURNING var MOVE my-class::"foo" TO var *> Inline method invocation COBOL does not provide a way to hide methods. Class data can be hidden, however, by declaring it without a clause, which leaves external code no way to access it. Method overloading was added in COBOL 2014.


Environment division

The environment division contains the configuration section and the input-output section. The configuration section is used to specify variable features such as currency signs, locales and character sets. The input-output section contains file-related information.


Files

COBOL supports three file formats, or ': sequential, indexed and relative. In sequential files, records are contiguous and must be traversed sequentially, similarly to a
linked list In computer science, a linked list is a linear collection of data elements whose order is not given by their physical placement in memory. Instead, each element points to the next. It is a data structure consisting of a collection of nodes whi ...
. Indexed files have one or more indexes which allow records to be randomly accessed and which can be sorted on them. Each record must have a unique key, but other, ', record keys need not be unique. Implementations of indexed files vary between vendors, although common implementations, such as C-ISAM and VSAM, are based on IBM's
ISAM Indexed Sequential Access Method (ISAM) is a method for creating, maintaining, and manipulating computer files of data so that records can be retrieved sequentially or randomly by one or more keys. Indexes of key fields are maintained to achieve ...
. Other implementations are Record Management Services on
OpenVMS OpenVMS, often referred to as just VMS, is a multi-user, multiprocessing and virtual memory-based operating system. It is designed to support time-sharing, batch processing, transaction processing and workstation applications. Customers using Op ...
and Enscribe on HPE NonStop (Tandem). Relative files, like indexed files, have a unique record key, but they do not have alternate keys. A relative record's key is its ordinal position; for example, the 10th record has a key of 10. This means that creating a record with a key of 5 may require the creation of (empty) preceding records. Relative files also allow for both sequential and random access. A common non-standard extension is the ' organization, used to process text files. Records in a file are terminated by a
newline A newline (frequently called line ending, end of line (EOL), next line (NEL) or line break) is a control character or sequence of control characters in character encoding specifications such as ASCII, EBCDIC, Unicode, etc. This character, or ...
and may be of varying length.


Data division

The data division is split into six sections which declare different items: the file section, for file records; the working-storage section, for
static variable In computer programming, a static variable is a variable that has been allocated "statically", meaning that its lifetime (or "extent") is the entire run of the program. This is in contrast to shorter-lived automatic variables, whose storage is ...
s; the local-storage section, for
automatic variable In computer programming, an automatic variable is a local variable which is allocated and deallocated automatically when program flow enters and leaves the variable's scope. The scope is the lexical context, particularly the function or block in ...
s; the linkage section, for parameters and the return value; the report section and the screen section, for
text-based user interface In computing, text-based user interfaces (TUI) (alternately terminal user interfaces, to reflect a dependence upon the properties of computer terminals and not just text), is a retronym describing a type of user interface (UI) common as an ear ...
s.


Aggregated data

Data items in COBOL are declared hierarchically through the use of level-numbers which indicate if a data item is part of another. An item with a higher level-number is subordinate to an item with a lower one. Top-level data items, with a level-number of 1, are called '. Items that have subordinate aggregate data are called '; those that do not are called '. Level-numbers used to describe standard data items are between 1 and 49. 01 some-record. *> Aggregate group record item 05 num PIC 9(10). *> Elementary item 05 the-date. *> Aggregate (sub)group record item 10 the-year PIC 9(4). *> Elementary item 10 the-month PIC 99. *> Elementary item 10 the-day PIC 99. *> Elementary item In the above example, elementary item and group item are subordinate to the record , while elementary items , , and are part of the group item . Subordinate items can be disambiguated with the (or ) keyword. For example, consider the example code above along with the following example: 01 sale-date. 05 the-year PIC 9(4). 05 the-month PIC 99. 05 the-day PIC 99. The names , , and are ambiguous by themselves, since more than one data item is defined with those names. To specify a particular data item, for instance one of the items contained within the group, the programmer would use (or the equivalent ). This syntax is similar to the "dot notation" supported by most contemporary languages.


Other data levels

A level-number of 66 is used to declare a re-grouping of previously defined items, irrespective of how those items are structured. This data level, also referred to by the associated , is rarely used and, circa 1988, was usually found in old programs. Its ability to ignore the hierarchical and logical structure data meant its use was not recommended and many installations forbade its use. 01 customer-record. 05 cust-key PIC X(10). 05 cust-name. 10 cust-first-name PIC X(30). 10 cust-last-name PIC X(30). 05 cust-dob PIC 9(8). 05 cust-balance PIC 9(7)V99. 66 cust-personal-details RENAMES cust-name THRU cust-dob. 66 cust-all-details RENAMES cust-name THRU cust-balance. A 77 level-number indicates the item is stand-alone, and in such situations is equivalent to the level-number 01. For example, the following code declares two 77-level data items, and , which are non-group data items that are independent of (not subordinate to) any other data items: 77 property-name PIC X(80). 77 sales-region PIC 9(5). An 88 level-number declares a ' (a so-called 88-level) which is true when its parent data item contains one of the values specified in its clause. For example, the following code defines two 88-level condition-name items that are true or false depending on the current character data value of the data item. When the data item contains a value of , the condition-name is true, whereas when it contains a value of or , the condition-name is true. If the data item contains some other value, both of the condition-names are false. 01 wage-type PIC X. 88 wage-is-hourly VALUE "H". 88 wage-is-yearly VALUE "S", "Y".


Data types

Standard COBOL provides the following data types: Type safety is variable in COBOL. Numeric data is converted between different representations and sizes silently and alphanumeric data can be placed in any data item that can be stored as a string, including numeric and group data. In contrast, object references and pointers may only be assigned from items of the same type and their values may be restricted to a certain type.


=PICTURE clause

= A (or ) clause is a string of characters, each of which represents a portion of the data item and what it may contain. Some picture characters specify the type of the item and how many characters or digits it occupies in memory. For example, a indicates a decimal digit, and an indicates that the item is signed. Other picture characters (called ' and ' characters) specify how an item should be formatted. For example, a series of characters define character positions as well as how a leading sign character is to be positioned within the final character data; the rightmost non-numeric character will contain the item's sign, while other character positions corresponding to a to the left of this position will contain a space. Repeated characters can be specified more concisely by specifying a number in parentheses after a picture character; for example, is equivalent to . Picture specifications containing only digit () and sign () characters define purely ' data items, while picture specifications containing alphabetic () or alphanumeric () characters define ' data items. The presence of other formatting characters define ' or ' data items.


=USAGE clause

= The clause declares the format in which data is stored. Depending on the data type, it can either complement or be used instead of a clause. While it can be used to declare pointers and object references, it is mostly geared towards specifying numeric types. These numeric formats are: * Binary, where a minimum size is either specified by the PICTURE clause or by a USAGE clause such as BINARY-LONG * , where data may be stored in whatever format the implementation provides; often equivalent to   * , the default format, where data is stored as a string * Floating-point, in either an implementation-dependent format or according to IEEE 754 * , where data is stored as a string using an extended character set * , where data is stored in the smallest possible decimal format (typically packed binary-coded decimal)


Report writer

The report writer is a declarative facility for creating reports. The programmer need only specify the report layout and the data required to produce it, freeing them from having to write code to handle things like page breaks, data formatting, and headings and footings. Reports are associated with report files, which are files which may only be written to through report writer statements. FD report-out REPORT sales-report. Each report is defined in the report section of the data division. A report is split into report groups which define the report's headings, footings and details. Reports work around hierarchical '. Control breaks occur when a key variable changes it value; for example, when creating a report detailing customers' orders, a control break could occur when the program reaches a different customer's orders. Here is an example report description for a report which gives a salesperson's sales and which warns of any invalid records: RD sales-report PAGE LIMITS 60 LINES FIRST DETAIL 3 CONTROLS seller-name. 01 TYPE PAGE HEADING. 03 COL 1 VALUE "Sales Report". 03 COL 74 VALUE "Page". 03 COL 79 PIC Z9 SOURCE PAGE-COUNTER. 01 sales-on-day TYPE DETAIL, LINE + 1. 03 COL 3 VALUE "Sales on". 03 COL 12 PIC 99/99/9999 SOURCE sales-date. 03 COL 21 VALUE "were". 03 COL 26 PIC $$$$9.99 SOURCE sales-amount. 01 invalid-sales TYPE DETAIL, LINE + 1. 03 COL 3 VALUE "INVALID RECORD:". 03 COL 19 PIC X(34) SOURCE sales-record. 01 TYPE CONTROL HEADING seller-name, LINE + 2. 03 COL 1 VALUE "Seller:". 03 COL 9 PIC X(30) SOURCE seller-name. The above report description describes the following layout:
Sales Report                                                             Page  1

Seller: Howard Bromberg
  Sales on 10/12/2008 were $1000.00
  Sales on 12/12/2008 were    $0.00
  Sales on 13/12/2008 were   $31.47
  INVALID RECORD: Howard Bromberg             XXXXYY

Seller: Howard Discount
...
Sales Report                                                            Page 12

  Sales on 08/05/2014 were  $543.98
  INVALID RECORD: William Selden      12052014FOOFOO
  Sales on 30/05/2014 were    $0.00
Four statements control the report writer: , which prepares the report writer for printing; , which prints a report group; , which suppresses the printing of a report group; and , which terminates report processing. For the above sales report example, the procedure division might look like this: OPEN INPUT sales, OUTPUT report-out INITIATE sales-report PERFORM UNTIL 1 <> 1 READ sales AT END EXIT PERFORM END-READ VALIDATE sales-record IF valid-record GENERATE sales-on-day ELSE GENERATE invalid-sales END-IF END-PERFORM TERMINATE sales-report CLOSE sales, report-out . Use of the Report Writer facility tends to vary considerably; some organizations use it extensively and some not at all. In addition, implementations of Report Writer ranged in quality, with those at the lower end sometimes using excessive amounts of memory at runtime.


Procedure division


Procedures

The sections and paragraphs in the procedure division (collectively called procedures) can be used as labels and as simple subroutines. Unlike in other divisions, paragraphs do not need to be in sections. Execution goes down through the procedures of a program until it is terminated. To use procedures as subroutines, the verb is used. A statement somewhat resembles a procedure call in a newer languages in the sense that execution returns to the code following the statement at the end of the called code; however, it does not provide a mechanism for parameter passing or for returning a result value. If a subroutine is invoked using a simple statement like , then control returns at the end of the called procedure. However, is unusual in that it may be used to call a range spanning a sequence of several adjacent procedures. This is done with the construct: PROCEDURE so-and-so. PERFORM ALPHA PERFORM ALPHA THRU GAMMA STOP RUN. ALPHA. DISPLAY 'A'. BETA. DISPLAY 'B'. GAMMA. DISPLAY 'C'. The output of this program will be: "A A B C". also differs from conventional procedure calls in that there is, at least traditionally, no notion of a call stack. As a consequence, nested invocations are possible (a sequence of code being 'ed may execute a statement itself), but require extra care if parts of the same code are executed by both invocations. The problem arises when the code in the inner invocation reaches the exit point of the outer invocation. More formally, if control passes through the exit point of a invocation that was called earlier but has not yet completed, the COBOL 2002 standard stipulates that the behavior is undefined. The reason is that COBOL, rather than a "return address", operates with what may be called a continuation address. When control flow reaches the end of any procedure, the continuation address is looked up and control is transferred to that address. Before the program runs, the continuation address for every procedure is initialized to the start address of the procedure that comes next in the program text so that, if no statements happen, control flows from top to bottom through the program. But when a statement executes, it modifies the continuation address of the called procedure (or the last procedure of the called range, if was used), so that control will return to the call site at the end. The original value is saved and is restored afterwards, but there is only one storage position. If two nested invocations operate on overlapping code, they may interfere which each other's management of the continuation address in several ways. The following example (taken from ) illustrates the problem: LABEL1. DISPLAY '1' PERFORM LABEL2 THRU LABEL3 STOP RUN. LABEL2. DISPLAY '2' PERFORM LABEL3 THRU LABEL4. LABEL3. DISPLAY '3'. LABEL4. DISPLAY '4'. One might expect that the output of this program would be "1 2 3 4 3": After displaying "2", the second causes "3" and "4" to be displayed, and then the first invocation continues on with "3". In traditional COBOL implementations, this is not the case. Rather, the first statement sets the continuation address at the end of so that it will jump back to the call site inside . The second statement sets the return at the end of but does not modify the continuation address of , expecting it to be the default continuation. Thus, when the inner invocation arrives at the end of , it jumps back to the outer statement, and the program stops having printed just "1 2 3". On the other hand, in some COBOL implementations like the open-source TinyCOBOL compiler, the two statements do not interfere with each other and the output is indeed "1 2 3 4 3". Therefore, the behavior in such cases is not only (perhaps) surprising, it is also not portable. A special consequence of this limitation is that cannot be used to write recursive code. Another simple example to illustrate this (slightly simplified from ): MOVE 1 TO A PERFORM LABEL STOP RUN. LABEL. DISPLAY A IF A < 3 ADD 1 TO A PERFORM LABEL END-IF DISPLAY 'END'. One might expect that the output is "1 2 3 END END END", and in fact that is what some COBOL compilers will produce. But other compilers, like IBM COBOL, will produce code that prints "1 2 3 END END END END ..." and so on, printing "END" over and over in an endless loop. Since there is limited space to store backup continuation addresses, the backups get overwritten in the course of recursive invocations, and all that can be restored is the jump back to .


Statements

COBOL 2014 has 47 statements (also called '), which can be grouped into the following broad categories: control flow, I/O, data manipulation and the report writer. The report writer statements are covered in the report writer section.


=Control flow

= COBOL's conditional statements are and . is a switch-like statement with the added capability of evaluating multiple values and conditions. This can be used to implement decision tables. For example, the following might be used to control a CNC lathe: EVALUATE TRUE ALSO desired-speed ALSO current-speed WHEN lid-closed ALSO min-speed THRU max-speed ALSO LESS THAN desired-speed PERFORM speed-up-machine WHEN lid-closed ALSO min-speed THRU max-speed ALSO GREATER THAN desired-speed PERFORM slow-down-machine WHEN lid-open ALSO ANY ALSO NOT ZERO PERFORM emergency-stop WHEN OTHER CONTINUE END-EVALUATE The statement is used to define loops which are executed a condition is true (not true, which is more common in other languages). It is also used to call procedures or ranges of procedures (see the procedures section for more details). and call subprograms and methods, respectively. The name of the subprogram/method is contained in a string which may be a literal or a data item. Parameters can be passed by reference, by content (where a copy is passed by reference) or by value (but only if a
prototype A prototype is an early sample, model, or release of a product built to test a concept or process. It is a term used in a variety of contexts, including semantics, design, electronics, and Software prototyping, software programming. A prototype ...
is available). unloads subprograms from memory. causes the program to jump to a specified procedure. The statement is a return statement and the statement stops the program. The statement has six different formats: it can be used as a return statement, a break statement, a continue statement, an end marker or to leave a procedure. Exceptions are raised by a statement and caught with a handler, or ', defined in the portion of the procedure division. Declaratives are sections beginning with a statement which specify the errors to handle. Exceptions can be names or objects. is used in a declarative to jump to the statement after the one that raised the exception or to a procedure outside the . Unlike other languages, uncaught exceptions may not terminate the program and the program can proceed unaffected.


=I/O

= File I/O is handled by the self-describing , , , and statements along with a further three: , which updates a record; , which selects subsequent records to access by finding a record with a certain key; and , which releases a
lock Lock(s) or Locked may refer to: Common meanings *Lock and key, a mechanical device used to secure items of importance *Lock (water navigation), a device for boats to transit between different levels of water, as in a canal Arts and entertainme ...
on the last record accessed. User interaction is done using and .


=Data manipulation

= The following verbs manipulate data: * , which sets data items to their default values. * , which assigns values to data items ; ''MOVE CORRESPONDING'' assigns corresponding like-named fields. * , which has 15 formats: it can modify indices, assign object references and alter table capacities, among other functions. * , , , , and , which handle arithmetic (with assigning the result of a formula to a variable). * and , which handle dynamic memory. * , which validates and distributes data as specified in an item's description in the data division. * and , which concatenate and split strings, respectively. * , which tallies or replaces instances of specified substrings within a string. * , which searches a table for the first entry satisfying a condition. Files and tables are sorted using and the verb merges and sorts files. The verb provides records to sort and retrieves sorted records in order.


Scope termination

Some statements, such as and , may themselves contain statements. Such statements may be terminated in two ways: by a period ('), which terminates ''all'' unterminated statements contained, or by a scope terminator, which terminates the nearest matching open statement. *> Terminator period ("implicit termination") IF invalid-record IF no-more-records NEXT SENTENCE ELSE READ record-file AT END SET no-more-records TO TRUE. *> Scope terminators ("explicit termination") IF invalid-record IF no-more-records CONTINUE ELSE READ record-file AT END SET no-more-records TO TRUE END-READ END-IF END-IF Nested statements terminated with a period are a common source of bugs. For example, examine the following code: IF x DISPLAY y. DISPLAY z. Here, the intent is to display y and z if condition x is true. However, z will be displayed whatever the value of x because the IF statement is terminated by an erroneous period after . Another bug is a result of the dangling else problem, when two IF statements can associate with an ELSE. IF x IF y DISPLAY a ELSE DISPLAY b. In the above fragment, the ELSE associates with the    statement instead of the    statement, causing a bug. Prior to the introduction of explicit scope terminators, preventing it would require    to be placed after the inner IF.


Self-modifying code

The original (1959) COBOL specification supported the infamous    statement, for which many compilers generated
self-modifying code In computer science, self-modifying code (SMC or SMoC) is source code, code that alters its own instruction (computer science), instructions while it is execution (computing), executing – usually to reduce the instruction path length and imp ...
. X and Y are procedure labels, and the single    statement in procedure X executed after such an statement means    instead. Many compilers still support it, but it was deemed
obsolete Obsolescence is the process of becoming antiquated, out of date, old-fashioned, no longer in general use, or no longer useful, or the condition of being in such a state. When used in a biological sense, it means imperfect or rudimentary when comp ...
in the COBOL 1985 standard and deleted in 2002. The statement was poorly regarded because it undermined "locality of context" and made a program's overall logic difficult to comprehend. As textbook author Daniel D. McCracken wrote in 1976, when "someone who has never seen the program before must become familiar with it as quickly as possible, sometimes under critical time pressure because the program has failed ... the sight of a GO TO statement in a paragraph by itself, signaling as it does the existence of an unknown number of ALTER statements at unknown locations throughout the program, strikes fear in the heart of the bravest programmer."


Hello, world

A
"Hello, World!" program A "Hello, World!" program is usually a simple computer program that emits (or displays) to the screen (often the Console application, console) a message similar to "Hello, World!". A small piece of code in most general-purpose programming languag ...
in COBOL: IDENTIFICATION DIVISION. PROGRAM-ID. hello-world. PROCEDURE DIVISION. DISPLAY "Hello, world!" . When the now famous "Hello, World!" program example in ''
The C Programming Language ''The C Programming Language'' (sometimes termed ''K&R'', after its authors' initials) is a computer programming book written by Brian Kernighan and Dennis Ritchie, the latter of whom originally designed and implemented the C programming langu ...
'' was first published in 1978 a similar mainframe COBOL program sample would have been submitted through JCL, very likely using a punch card reader, and 80 column punch cards. The listing below, ''with an empty DATA DIVISION'', was tested using Linux and the System/370 Hercules emulator running MVS 3.8J. The JCL, written in July 2015, is derived from the Hercules tutorials and samples hosted by Jay Moseley. In keeping with COBOL programming of that era, HELLO, WORLD is displayed in all capital letters. //COBUCLG JOB (001),'COBOL BASE TEST', 00010000 // CLASS=A,MSGCLASS=A,MSGLEVEL=(1,1) 00020000 //BASETEST EXEC COBUCLG 00030000 //COB.SYSIN DD * 00040000 00000* VALIDATION OF BASE COBOL INSTALL 00050000 01000 IDENTIFICATION DIVISION. 00060000 01100 PROGRAM-ID. 'HELLO'. 00070000 02000 ENVIRONMENT DIVISION. 00080000 02100 CONFIGURATION SECTION. 00090000 02110 SOURCE-COMPUTER. GNULINUX. 00100000 02120 OBJECT-COMPUTER. HERCULES. 00110000 02200 SPECIAL-NAMES. 00120000 02210 CONSOLE IS CONSL. 00130000 03000 DATA DIVISION. 00140000 04000 PROCEDURE DIVISION. 00150000 04100 00-MAIN. 00160000 04110 DISPLAY 'HELLO, WORLD' UPON CONSL. 00170000 04900 STOP RUN. 00180000 //LKED.SYSLIB DD DSNAME=SYS1.COBLIB,DISP=SHR 00190000 // DD DSNAME=SYS1.LINKLIB,DISP=SHR 00200000 //GO.SYSPRINT DD SYSOUT=A 00210000 // 00220000 After submitting the JCL, the MVS console displayed: 19.52.48 JOB 3 $HASP100 COBUCLG ON READER1 COBOL BASE TEST 19.52.48 JOB 3 IEF677I WARNING MESSAGE(S) FOR JOB COBUCLG ISSUED 19.52.48 JOB 3 $HASP373 COBUCLG STARTED - INIT 1 - CLASS A - SYS BSP1 19.52.48 JOB 3 IEC130I SYSPUNCH DD STATEMENT MISSING 19.52.48 JOB 3 IEC130I SYSLIB DD STATEMENT MISSING 19.52.48 JOB 3 IEC130I SYSPUNCH DD STATEMENT MISSING 19.52.48 JOB 3 IEFACTRT - Stepname Procstep Program Retcode 19.52.48 JOB 3 COBUCLG BASETEST COB IKFCBL00 RC= 0000 19.52.48 JOB 3 COBUCLG BASETEST LKED IEWL RC= 0000 19.52.48 JOB 3 +HELLO, WORLD 19.52.48 JOB 3 COBUCLG BASETEST GO PGM=*.DD RC= 0000 19.52.48 JOB 3 $HASP395 COBUCLG ENDED ''Line 10 of the console listing above is highlighted for effect, the highlighting is not part of the actual console output''. The associated compiler listing generated over four pages of technical detail and job run information, for the single line of output from the 14 lines of COBOL.


Reception


Lack of structure

In the 1970s, adoption of the
structured programming Structured programming is a programming paradigm aimed at improving the clarity, quality, and development time of a computer program by making specific disciplined use of the structured control flow constructs of selection ( if/then/else) and repet ...
paradigm was becoming increasingly widespread. Edsger Dijkstra, a preeminent computer scientist, wrote a
letter to the editor A letter to the editor (LTE) is a Letter (message), letter sent to a publication about an issue of concern to the reader. Usually, such letters are intended for publication. In many publications, letters to the editor may be sent either through ...
of
Communications of the ACM ''Communications of the ACM'' (''CACM'') is the monthly journal of the Association for Computing Machinery (ACM). History It was established in 1958, with Saul Rosen as its first managing editor. It is sent to all ACM members. Articles are i ...
, published in 1975 entitled "How do we tell truths that might hurt?", in which he was critical of COBOL and several other contemporary languages; remarking that "the use of COBOL cripples the mind". In a published dissent to Dijkstra's remarks, the computer scientist Howard E. Tompkins claimed that unstructured COBOL tended to be "written by programmers that have never had the benefit of structured COBOL taught well", arguing that the issue was primarily one of training. One cause of spaghetti code was the statement. Attempts to remove s from COBOL code, however, resulted in convoluted programs and reduced code quality. s were largely replaced by the statement and procedures, which promoted
modular programming Modular programming is a software design technique that emphasizes separating the functionality of a program into independent, interchangeable modules, such that each contains everything necessary to execute only one aspect or "concern" of the d ...
and gave easy access to powerful looping facilities. However, could be used only with procedures so loop bodies were not located where they were used, making programs harder to understand. COBOL programs were infamous for being monolithic and lacking modularization. COBOL code could be modularized only through procedures, which were found to be inadequate for large systems. It was impossible to restrict access to data, meaning a procedure could access and modify data item. Furthermore, there was no way to pass
parameter A parameter (), generally, is any characteristic that can help in defining or classifying a particular system (meaning an event, project, object, situation, etc.). That is, a parameter is an element of a system that is useful, or critical, when ...
s to a procedure, an omission Jean Sammet regarded as the committee's biggest mistake. Another complication stemmed from the ability to a specified sequence of procedures. This meant that control could jump to and return from any procedure, creating convoluted control flow and permitting a programmer to break the single-entry single-exit rule. This situation improved as COBOL adopted more features. COBOL-74 added subprograms, giving programmers the ability to control the data each part of the program could access. COBOL-85 then added nested subprograms, allowing programmers to hide subprograms. Further control over data and code came in 2002 when object-oriented programming, user-defined functions and user-defined data types were included. Nevertheless, much important legacy COBOL software uses unstructured code, which has become practically unmaintainable. It can be too risky and costly to modify even a simple section of code, since it may be used from unknown places in unknown ways.


Compatibility issues

COBOL was intended to be a highly portable, "common" language. However, by 2001, around 300 dialects had been created. One source of dialects was the standard itself: the 1974 standard was composed of one mandatory nucleus and eleven functional modules, each containing two or three levels of support. This permitted 104,976 possible variants. COBOL-85 was not fully compatible with earlier versions, and its development was controversial. Joseph T. Brophy, the CIO of Travelers Insurance, spearheaded an effort to inform COBOL users of the heavy reprogramming costs of implementing the new standard. As a result, the ANSI COBOL Committee received more than 2,200 letters from the public, mostly negative, requiring the committee to make changes. On the other hand, conversion to COBOL-85 was thought to increase productivity in future years, thus justifying the conversion costs.


Verbose syntax

COBOL syntax has often been criticized for its verbosity. Proponents say that this was intended to make the code self-documenting, easing program maintenance. COBOL was also intended to be easy for programmers to learn and use, while still being readable to non-technical staff such as managers. The desire for readability led to the use of English-like syntax and structural elements, such as nouns, verbs, clauses, sentences, sections, and divisions. Yet by 1984, maintainers of COBOL programs were struggling to deal with "incomprehensible" code and the main changes in COBOL-85 were there to help ease maintenance. Jean Sammet, a short-range committee member, noted that "little attempt was made to cater to the professional programmer, in fact people whose main interest is programming tend to be very unhappy with COBOL" which she attributed to COBOL's verbose syntax. Later, COBOL suffered from a shortage of material covering it; it took until 1963 for introductory books to appear (with Richard D. Irwin publishing a college textbook on COBOL in 1966). Donald Nelson, chair of the CODASYL COBOL committee, said in 1984 that "academics ... hate COBOL" and that computer science graduates "had 'hate COBOL' drilled into them". By the mid-1980s, there was also significant condescension towards COBOL in the business community from users of other languages, for example FORTRAN or assembler, implying that COBOL could be used only for non-challenging problems. In 2003, COBOL featured in 80% of
information systems An information system (IS) is a formal, sociotechnical, organizational system designed to collect, process, store, and distribute information. From a sociotechnical perspective, information systems comprise four components: task, people, structu ...
curricula in the United States, the same proportion as C++ and
Java Java is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea (a part of Pacific Ocean) to the north. With a population of 156.9 million people (including Madura) in mid 2024, proje ...
. Ten years later, a poll by Micro Focus found that 20% of university academics thought COBOL was outdated or dead and that 55% believed their students thought COBOL was outdated or dead. The same poll also found that only 25% of academics had COBOL programming on their curriculum even though 60% thought they should teach it.


Concerns about the design process

Doubts have been raised about the competence of the standards committee. Short-term committee member Howard Bromberg said that there was "little control" over the development process and that it was "plagued by discontinuity of personnel and ... a lack of talent." Jean Sammet and Jerome Garfunkel also noted that changes introduced in one revision of the standard would be reverted in the next, due as much to changes in who was in the standard committee as to objective evidence. COBOL standards have repeatedly suffered from delays: COBOL-85 arrived five years later than hoped, COBOL 2002 was five years late, and COBOL 2014 was six years late. To combat delays, the standard committee allowed the creation of optional addenda which would add features more quickly than by waiting for the next standard revision. However, some committee members raised concerns about incompatibilities between implementations and frequent modifications of the standard.


Influences on other languages

COBOL's data structures influenced subsequent programming languages. Its record and file structure influenced
PL/I PL/I (Programming Language One, pronounced and sometimes written PL/1) is a procedural, imperative computer programming language initially developed by IBM. It is designed for scientific, engineering, business and system programming. It has b ...
and Pascal, and the REDEFINES clause was a predecessor to Pascal's variant records. Explicit file structure definitions preceded the development of
database management systems In computing, a database is an organized collection of data or a type of data store based on the use of a database management system (DBMS), the software that interacts with end users, applications, and the database itself to capture and ana ...
and aggregated data was a significant advance over Fortran's arrays. PICTURE data declarations were incorporated into PL/I, with minor changes. COBOL's facility, although considered "primitive", influenced the development of
include directive An include directive instructs a text file processor to replace the directive text with the content of a specified file. The act of including may be logical in nature. The processor may simply process the include file content at the location of ...
s. The focus on portability and standardization meant programs written in COBOL could be portable and facilitated the spread of the language to a wide variety of hardware platforms and operating systems.This can be seen in: * * * Additionally, the well-defined division structure restricts the definition of external references to the Environment Division, which simplifies platform changes in particular.


See also

* Alphabetical list of programming languages * BLIS/COBOL * CODASYL * Comparison of programming languages * *


Notes


References


Citations


Sources

* * * * * * * * * (Link goes to draft N 0147) * * * * * * * * * * * * *


External links

* *
COBOL Language Standard
(1991; COBOL-85 with Amendment 1), from
The Open Group The Open Group is a global consortium that seeks to "enable the achievement of business objectives" by developing " open, vendor-neutral technology standards and certifications." It has 900+ member organizations and provides a number of services ...
{{DEFAULTSORT:Cobol .NET programming languages 1959 software Class-based programming languages Computer-related introductions in 1959 Cross-platform software Object-oriented programming languages Procedural programming languages Programming languages created by women Programming languages created in 1959 Programming languages with an ISO standard Statically typed programming languages Structured programming languages