The Sort/Merge utility is a mainframe program to sort records in a file into a specified order, merge pre-sorted files into a sorted file, or copy selected records. Internally, these utilities use one or more of the standard sorting algorithms, often with proprietary fine-tuned code.

Mainframes were originally supplied with limited main memory by today's standards, and the amount of data to be sorted was frequently very large. Because of this, unlike more recent sort programs, early Sort/Merge programs placed great emphasis on efficient techniques for sorting data on secondary storage, typically tape or disk. In 1968 the OS/360 Sort/Merge program provided five different "sequence distribution techniques" that could be used depending on the number and type of devices available.

Prior to the System/370, all IBM mainframe operating systems included sort/merge utilities. With the announcement of virtual storage operating systems, DOS/VS and OS/VS, IBM unbundled much of the software and offered chargeable sort/merge program products. For OS/VS IBM offered 5740-SM1, OS/VS Sort/Merge, later renamed Data Facility Sort (DFSORT). In 1990 IBM introduced a new merge algorithm called BLOCKSET in DFSORT, the successor to OS/360 Sort/Merge. Of historical note, the BLOCKSET algorithm was invented by an IBM Systems Engineer in 1963 and was discovered in IBM's archives and implemented in 1990.

Sort/Merge is very frequently used; it is often the most commonly used application program in a mainframe shop, generally consuming about twenty percent of the shop's processing power.

Modern Sort/Merge programs can also copy files, select or omit certain records, summarize records, remove duplicates, reformat records, append new data and produce reports. Indeed, most Sort/Merge applications use this wide range of additional processing capabilities rather than purely sorting or merging records: the Sort/Merge product is a very fast way of performing input to and output from these functions.

Quite a number of "user exits" are supported. An exit may be a load module (i.e., a member of a library) or an object deck (i.e., the output of an assembler); the Sort/Merge application loads a load module or links an object deck (termed "dynamic link editing" in DFSORT), as specified and required.

Working storage datasets (i.e., SORTWK01, ..., SORTWKnn) may be on disk or tape, although the BLOCKSET algorithm is restricted to disk working storage; more working storage datasets generally improve performance.

Sort/merge is important enough that multiple companies each sell their own sort/merge package for
IBM mainframes and their z/OS, z/VM and z/VSE operating systems. These programs are largely compatible with IBM's SORT programs, often with some extensions. The major Sort/Merge packages are:
* DFSORT, sold by IBM.
* SyncSort, sold by Syncsort, Inc.
* CA-Sort, sold by CA Technologies.
(Some of these companies also sell versions for other platforms, such as Unix, Linux, or Windows.)

Sort/Merge is a critical component of many mainframe environments. When migrating from the mainframe to other platforms such as Unix, Linux or Windows, a Sort/Merge utility is needed. Examples are MFSORT from Micro Focus and AHLSORT; these products emulate the functions of DFSORT outside of the mainframe environment.

Historically, the "alias" SORT has been used to refer to an installation's preferred sort program, whether IBM's Sort/Merge or a third-party Sort/Merge program (e.g., SYNCSORT, CASORT). DFSORT is often referred to by its program name, ICEMAN (component ICE); the original OS/360 Sort/Merge program name was IERRCO00 (component IER), also with the "alias" SORT.
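
The record-level functions described above (selecting or omitting records, sorting on a key field, removing duplicates, and merging pre-sorted inputs) can be illustrated with a short Python sketch. It only illustrates the concepts and does not show how DFSORT or any vendor product is implemented or invoked. The fixed five-character key layout, the sample records, and the helper names `key_of`, `sort_records` and `merge_sorted` are all assumptions made for this example.

```python
import heapq
from itertools import groupby


def key_of(record, start=0, length=5):
    """Extract a fixed-position key field (hypothetical record layout)."""
    return record[start:start + length]


def sort_records(records, include=None, unique=False):
    """Select records, sort them by key, and optionally drop duplicate keys."""
    # Keep every record when no selection predicate is given.
    selected = (r for r in records if include is None or include(r))
    # sorted() is stable, so records with equal keys keep their input order.
    ordered = sorted(selected, key=key_of)
    if unique:
        # Keep only the first record seen for each key.
        ordered = [next(group) for _, group in groupby(ordered, key=key_of)]
    return ordered


def merge_sorted(*inputs):
    """Merge inputs that are each already in key order into one sorted list."""
    return list(heapq.merge(*inputs, key=key_of))


# Hypothetical sample data: a 5-character numeric key followed by a name.
batch_a = ["00003CHICAGO", "00001BOSTON", "00003DENVER", "00002AUSTIN"]
batch_b = ["00005TAMPA", "00004OMAHA"]

sorted_a = sort_records(batch_a, unique=True)
sorted_b = sort_records(batch_b)
print(merge_sorted(sorted_a, sorted_b))
```

The merge step never re-sorts its inputs: it only compares the current head record of each input, which is why merging pre-sorted files is much cheaper than sorting their concatenation.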


IBM OS/360 SORT

Prior to virtual storage operating systems, "The input data set was almost always too large to be brought into main storage and sorted all at once." SORT used a ''replacement selection technique'' to reduce storage usage. The program placed emphasis on ''sequence distribution techniques'' for making the best use of secondary storage "sort work" (SORTWK) files. These techniques were methods of distributing partially sorted sequences of records most efficiently. A technique could be chosen by default, depending on the number and type of devices available, or could be specified by the user.

There were five distribution techniques available to the OS/360 SORT:
* Magnetic tape techniques
** Balanced (BALN) - required a minimum of 12,000 bytes of main storage and 2x+1 tape devices for intermediate storage, where ''x'' is the number of input tape volumes, up to a maximum of 15 input reels.
** Polyphase (POLY) - required a minimum of 12,000 bytes and 3 intermediate storage tape devices. Only one input reel was allowed.
** Oscillating (OSCL) - required 21,000 bytes and max(x+2,4) intermediate tape devices, where ''x'' is the number of input volumes, up to a maximum of 15.
* Direct access techniques
** Balanced (BALN) - required a minimum of 13,000 bytes and 3 to 6 disk work areas. The maximum number of records that could be sorted depended on the main and auxiliary storage available.
** Crisscross (CRCX) - not available for IBM 2311 or IBM 2301 auxiliary storage devices. Required a minimum of 24,000 bytes of main storage and 6 to 17 auxiliary storage work areas. The maximum number of records that could be sorted depended on the main and auxiliary storage available.
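
The text names replacement selection without describing it. As a rough illustration, the following Python sketch shows the textbook form of the technique: a small in-memory heap produces sorted runs that can be longer than the memory available, and the runs are then merged. It is not IBM's implementation; the function name `replacement_selection`, the `capacity` parameter (standing in for the number of records that fit in main storage), and the sample data are assumptions made for this example.

```python
import heapq
from itertools import islice

_END = object()  # marks the end of the input stream


def replacement_selection(records, capacity):
    """Yield sorted runs (lists) using a heap that holds `capacity` records."""
    source = iter(records)
    # Heap entries are (run_number, record); a lower run number sorts first.
    heap = [(0, record) for record in islice(source, capacity)]
    heapq.heapify(heap)
    current_run, run = 0, []
    while heap:
        run_number, record = heapq.heappop(heap)
        if run_number != current_run:
            # Only records held back for the next run remain: close this run.
            yield run
            run, current_run = [], run_number
        run.append(record)
        incoming = next(source, _END)
        if incoming is not _END:
            # A record smaller than the one just written cannot join the
            # current run, so it is tagged for the next run instead.
            target = run_number if incoming >= record else run_number + 1
            heapq.heappush(heap, (target, incoming))
    if run:
        yield run


runs = list(replacement_selection([5, 3, 8, 1, 9, 2, 7, 4, 6], capacity=3))
print(runs)                      # → [[3, 5, 8, 9], [1, 2, 4, 6, 7]]
print(list(heapq.merge(*runs)))  # → [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

With room for only three records, the sketch produces runs of four and five records. Longer initial runs mean fewer runs to distribute across work devices and merge.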


IBM OS/VS SORT

The distribution techniques listed for tape sorts were retained by the OS/VS SORT program, now called "conventional techniques." The disk sort techniques were replaced by four new ones:
* FLR-Blockset for fixed-length records
* VLR-Blockset for variable-length records
* Peerage for fixed-length records
* Vale for both fixed-length and variable-length records


See also

* BatchPipes
* External sort


External links


* IBM DFSORT Manuals
* Some basic DFSORT and SyncSort examples