OS/360 Object File Format

The OS/360 Object File Format is the standard object module file format for the IBM DOS/360, OS/360 and VM/370, Univac VS/9, and Fujitsu BS2000 mainframe operating systems. In the 1990s, the format was extended with the XSD record type for the MVS operating system to support longer module names in the C programming language. This format is still in use by the z/VSE operating system (the follow-on to DOS/360). In contrast, it has been superseded by the GOFF file format on MVS (the follow-on to OS/360) and on z/VM (the follow-on to VM/370). Since the MVS and z/VM loaders still handle this older format, some compilers continue to produce it instead of the newer GOFF format. (''IBM z/VSE System Control Statements: Version 5 Release 1'', SC34-2637-00, IBM, 1984, 2011.)


Use

This format describes a compiled application's object code, which can be fed to a linkage editor to be made into an executable program, or run directly through an object module loader. It is created by the Assembler or by a programming language compiler. For the rest of this article, unless there is a reason to distinguish a language compiler from an assembler, the term "compile" includes "assemble" and "compiler" includes "assembler."


Weaknesses

This format was considered adequate when it was originally developed, around 1964. Over time, a number of weaknesses became apparent, among them:
* it supports names of at most 8 bytes (and by convention names are upper case only and restricted to certain characters; see the discussion below).
* alignment cannot be specified.
* a module that is pure data and is not executable cannot be specified.
* a reentrant module (as opposed to one that is merely read-only) cannot be specified.
* it cannot distinguish between a subroutine (a routine that handles data only through arguments) and a function (a routine that returns data through a return value).
* a module designed to be movable (as opposed to merely reentrant) cannot be specified.
* address constants cannot be identified as pointers (such as for access to a data structure) as opposed to, say, access to a table (that is not changed) or to a virtual method in a dynamic record.
* attributes cannot be assigned to external references (a reference to code vs. a reference to data).
* there is no means to allow procedures or functions to check or validate argument types or validate external structures.
* there is no means to declare an object, where part of the structure is data and part is code (methods that operate upon the data of the object).
* the SYM symbolic table is limited in the information it can provide.

These and other weaknesses caused this format to be superseded by the GOFF module file format.

It was nevertheless a good choice: it satisfied the needs of the programming languages in use at the time, it worked, and it was simple to implement (important when machines might have as little as 8K of memory, and many ran multiple concurrent or consecutive jobs in as little as 64K while still performing useful work). It was simple to use, and for simple programs it can still be adequate (object orientation and concepts like virtual methods were decades in the future when it was developed). The format also remains satisfactory for older programs that were never changed, or where the source code is unavailable and the object files are the only part of the program remaining. Note that the GOFF file format merely superseded this format (and provides more information for a language compiler or the assembler); the format is still valid, may continue to be used, and was not deprecated.

This format has the advantage that it is easy to create. Provided a language can live with its restrictions (module names of at most 8 upper-case characters, and applications no larger than 2^24 bytes, or 16 megabytes, of code and data), a compiler targeting this format can be written in any programming language that can write 80-byte fixed-format binary files (basically anything, including COBOL and FORTRAN, not just Assembler). In fact, the Australian Atomic Energy Commission's Pascal 8000 compiler for the IBM 360/370, itself written in Pascal as a self-hosting compiler in 1978–1980, directly created its own object files without using the Assembler as an intermediate step.


Record Types

There are 6 different record types:
* ESD records define main programs, subroutines, functions, dummy sections, Fortran Common, and any module or routine that can be called by another module. They define the program(s) or program segments compiled in this execution of the compiler, and the external routines used by the program (such as exit() in C, CALL EXIT in Fortran, or new() and dispose() in Pascal). ESD records should occur before any reference to an ESD symbol.
* TXT records contain the machine instructions or data held by the module.
* RLD records are used to relocate addresses. For example, a program referencing an address located 500 bytes inside the module will internally store the address as 500, but when the module is loaded into memory it will almost certainly be located somewhere else, so an RLD record tells the linkage editor or loader which addresses to change. Also, when a module references an external symbol, it usually sets the value of the symbol to zero and includes an RLD entry for that symbol so that the loader or linkage editor can alter the address to the correct value.
* SYM records were added to provide additional information about a symbol, such as the type of data (character or numeric) and the size of the item.
* XSD records were added to provide additional information beyond that in the ESD record about public symbols such as procedures and functions, and to allow a procedure or function name to be longer than 8 characters.
* END records indicate the end of a module and, optionally, where the program is to begin execution.
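The relocation idea behind RLD records can be illustrated with a small sketch. It does not parse real RLD records; it assumes a hypothetical in-memory model in which the module image is a byte array and the positions of the address constants are already known as a list of offsets. The function name `relocate`, its parameters, and the 4-byte address constant length are assumptions made for illustration only.

```python
def relocate(image: bytearray, adcon_offsets: list[int],
             load_address: int, length: int = 4) -> None:
    """Add the load address to each address constant in the image.

    Hypothetical model: adcon_offsets stands in for the information a
    loader would obtain from RLD entries. Values are big-endian, as on
    System/360. The 4-byte default length is an assumption.
    """
    modulus = 1 << (8 * length)
    for offset in adcon_offsets:
        # Read the module-relative address stored by the compiler.
        stored = int.from_bytes(image[offset:offset + length], "big")
        # Rewrite it as an absolute address for the actual load point.
        relocated = (stored + load_address) % modulus
        image[offset:offset + length] = relocated.to_bytes(length, "big")


# A 16-byte module whose address constant at offset 8 refers to
# module-relative address 500, loaded at hypothetical address 0x20000.
module = bytearray(16)
module[8:12] = (500).to_bytes(4, "big")
relocate(module, [8], 0x20000)
```

After the call, the constant at offset 8 holds the load address plus 500. A reference to an external symbol would follow the same pattern, with a stored value of zero and the symbol's resolved address added in place of the module's load address.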


Format

All records are exactly 80 bytes long; unused fields should be blank-filled. The first byte of every record is always the binary value 02. The next 3 bytes are always the record type. Character values are in EBCDIC. The remaining fields of each record depend on the record type. By convention, if the module was named in the TITLE statement of an assembly language program (or the language compiler decides to give the module a name), its name appears left-justified in positions 73–80 of each record; if the name is shorter than 8 characters or no name was given, a sequence number (in characters, right-justified with zero fill) occupies the remainder of that field. In practice, the sequence number field may be blank or contain anything the language translator wants to put there; it is essentially a comment field.

The assembler (or compiler, in the case of a high-level language such as C, COBOL, Fortran, Pascal, PL/I or RPG III) creates an ESD record for each subroutine, function, or program, and for common blocks in the case of Fortran programs. Additional ESD entries are created for ENTRY statements (an alias for a module or an alternative entry point for a module), for additional subroutines, functions or Fortran named or blank COMMON blocks included as part of a compiled or assembled module, and for names of external subroutines and functions called by a module. There are two kinds of public symbol identifiers: ESDID entries and LDID entries. ESDID entries are CSECTs and DSECTs (programs, procedures and functions, and possibly record or structure declarations), and LDID entries are ENTRY statements (alternative or alias entry points to a CSECT or DSECT). The ESDID numbering space is separate from the LDID numbering space, so two differently named symbols, one an ESDID and one an LDID, can both have the binary value 0001. The program's executable object code and data are stored in TXT records. Calls to other subroutines, functions or COMMON blocks are resolved through RLD records, which modify the address stored in a TXT record to produce the complete address of the subroutine or function. Optionally, a language can provide symbolic reference information, such as object names, data type information or debugging symbols, through SYM records. Finally, the END record indicates the end of an object module file and, optionally, the address at which the subroutine, function or program should start, if that is not the first byte of the first routine (some routines have non-executable data preceding their actual code, or the first routine assembled or compiled is not the "main" program or "primary" module).
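The fields common to every record (the X'02' byte, the 3-character record type, and the name or sequence field in positions 73–80) can be sketched as follows. This is a minimal illustration, not a full reader: the function names are hypothetical, the type-specific fields in positions 5–72 are left uninterpreted, and Python's `cp037` codec is assumed as a stand-in for whichever EBCDIC code page a given system uses.

```python
RECORD_LENGTH = 80
EBCDIC = "cp037"  # assumption: one common EBCDIC code page


def parse_record_header(record: bytes) -> dict:
    """Split an 80-byte object record into its common fields."""
    if len(record) != RECORD_LENGTH:
        raise ValueError("object records are exactly 80 bytes long")
    if record[0] != 0x02:
        raise ValueError("first byte of a record must be binary 02")
    return {
        "type": record[1:4].decode(EBCDIC),     # ESD, TXT, RLD, SYM, XSD, END
        "body": record[4:72],                   # type-dependent, not decoded here
        "ident": record[72:80].decode(EBCDIC),  # positions 73-80: name/sequence
    }


def build_blank_record(record_type: str, ident: str = "") -> bytes:
    """Build a record with only the common fields set, rest blank-filled."""
    if len(record_type) != 3:
        raise ValueError("record type must be 3 characters")
    text = record_type + " " * 68 + ident.ljust(8)[:8]
    return b"\x02" + text.encode(EBCDIC)


header = parse_record_header(build_blank_record("END", "MYPROG"))
```

Because every record is fixed-length and self-identifying, a reader can process a file 80 bytes at a time and dispatch on the `type` field.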
It has been reported that, because of the way older assemblers worked (circa 1968–1975), a program assembled faster if its data was placed before its code: once the assembler began encountering instructions, it ran much more slowly. Programmers therefore wrote routines with the data and constants first, followed by the code. When assembling a program could take 30 minutes to an hour, rather than a few seconds as now, this was a big difference.

Although not required, it is a convention that module and symbolic names are in all upper case, that the first character of a name is a letter or one of the symbols @, # or $, and that subsequent characters are those characters plus the digits 0 through 9. Older software may or may not correctly process object module files that use lower-case identifiers. Most programming languages other than Assembly cannot call modules that have names containing @ or # (notably Fortran, which is why its run-time library has a name with a # in it so it would not conflict with any name chosen by a programmer), so most programs, subroutines, or functions were written to use only a letter for the first character and only letters and digits for the 2nd through 8th characters. While most non-assembler languages cannot handle $ in a name, Fortran is an exception and can recognize subroutine names with $ in them. (This choice not to use #, @ or $ does not apply to a "main" program written in Assembler or any language that can use these identifiers; the program loader does not care what the name of the module is.) Also, modules written to be used as subroutines typically restricted themselves to 6 characters or fewer, as versions of Fortran before about 1978 cannot use subroutine or module names longer than 6 characters. The COBOL compiler typically discards the dash character if it appears in a program's PROGRAM-ID or in a CALL statement to an external module. In the 1990s, a new record type, the XSD record, was added to extend this object module format to module names longer than 8 characters and to permit mixed-case names, as required by the C programming language.
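The naming conventions described above can be expressed as two small checks. This is only a sketch of the conventions as stated in this article; the function names are hypothetical, and the "portable" rule (letters and digits only, starting with a letter) reflects the common practice described for modules meant to be callable from most high-level languages, not a rule enforced by the format itself.

```python
import re

# Convention: first character is a letter or @, # or $; later characters
# may also be digits; at most 8 characters, upper case only.
_CONVENTIONAL = re.compile(r"[A-Z@#$][A-Z0-9@#$]{0,7}")

# Common practice for modules callable from most languages:
# a letter followed by up to 7 letters or digits.
_PORTABLE = re.compile(r"[A-Z][A-Z0-9]{0,7}")


def is_conventional_name(name: str) -> bool:
    """True if the name follows the general 8-character convention."""
    return _CONVENTIONAL.fullmatch(name) is not None


def is_portable_name(name: str) -> bool:
    """True if the name also avoids @, # and $."""
    return _PORTABLE.fullmatch(name) is not None


def fits_old_fortran(name: str) -> bool:
    """True if the name is portable and at most 6 characters long,
    the limit described for Fortran versions before about 1978."""
    return is_portable_name(name) and len(name) <= 6
```

A name such as `$ENTRY` passes the first check but not the second, which matches the observation that such names are usable from Assembler but not from most other languages.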


General layout


ESD record


RLD relocation entries


SYM record


XSD record


END record


References

* ''OS/MVS Program Management: Advanced Facilities'', SA22-7644-07, Eighth Edition, IBM, Poughkeepsie, NY, September 2007. http://publibz.boulder.ibm.com/epubs/pdf/iea2b270.pdf (Retrieved August 9, 2013)
* John R. Ehrman, ''How the Linkage Editor Works: A Tutorial on Object/Load Modules, Link Editors, Loaders, and What They Do for (and to) You'', IBM Silicon Valley (Santa Teresa) Laboratory, San Jose, 1994, 2001. ftp://ftp.boulder.ibm.com/software/websphere/awdtools/hlasm/s8169a.pdf (Retrieved July 29, 2013)