64-bit
floating point unit
s (FPUs) and four
central processing unit
s (CPUs) able to process 1 billion operations per second. Due to budget constraints, only a single "quadrant" with 64 FPUs and a single CPU was built. Since the FPUs all had to process the same instruction – ADD, SUB etc. – in modern terminology the design would be considered to be
single instruction, multiple data
, or SIMD.
The concept of building a computer using an array of processors came to
Daniel Slotnick
while working as a programmer on the
IAS machine
in 1952. A formal design did not start until 1960, when Slotnick was working at
Westinghouse Electric
and arranged development funding under a
US Air Force
contract. When that funding ended in 1964, Slotnick moved to the
University of Illinois
and joined the ILLIAC team. With funding from the
Advanced Research Projects Agency
(ARPA), they began the design of a newer concept with 256 64-bit processors instead of the original concept with 1,024 1-bit processors.
While the machine was being built at Burroughs, the university began building a new facility to house it. Political tension over the funding from the
US Department of Defense
led ARPA and the university to fear for the machine's safety. When the first 64-processor quadrant of the machine was completed in 1972, it was sent to the
NASA Ames Research Center
in California. After three years of thorough modification to fix various flaws, ILLIAC IV was connected to the
ARPANET
for distributed use in November 1975, becoming the first network-available supercomputer, beating the
Cray-1
by nearly 12 months.
Running at half its design speed, the one-quadrant ILLIAC IV delivered a peak of 50 MFLOPS, making it the fastest computer in the world at that time. It is also credited with being the first large computer to use solid-state memory, as well as the most complex computer built to that date, with over 1 million gates. Generally considered a failure due to massive budget overruns, the design was nonetheless instrumental in the development of new techniques and systems for programming parallel systems. In the 1980s, several machines based on ILLIAC IV concepts were successfully delivered.
History
Origins
In June 1952,
Daniel Slotnick
began working on the
IAS machine
at the
Institute for Advanced Study
(IAS) at
Princeton University
. The IAS machine featured a bit-parallel math unit that operated on 40-bit
words
. Originally equipped with
Williams tube
memory, a
magnetic drum
from
Engineering Research Associates
was later added. This drum had 80 tracks so two words could be read at a time, and each track stored 1,024 bits.
While contemplating the drum's mechanism, Slotnick began to wonder if that was the correct way to build a computer. If the bits of a word were written serially to a single track, instead of in parallel across 40 tracks, then the data could be fed into a bit-serial computer directly from the drum, bit by bit. The drum would still have multiple tracks and heads, but instead of gathering up a word and sending it to a single ALU, in this concept the data on each track would be read a bit at a time and sent into parallel ALUs. This would be a word-parallel, bit-serial computer.
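The word-parallel, bit-serial idea can be illustrated with a small sketch. This is not a model of any real hardware; the function name `bit_serial_add` and the list-based "tracks" are assumptions made for illustration. Each list element stands for one track with its own 1-bit ALU and carry flip-flop, and the outer loop advances one bit position at a time, as if the drum were rotating under the heads.

```python
def bit_serial_add(a_words, b_words, width=40):
    """Add two lists of words one bit position at a time.

    Hypothetical illustration: each index i plays the role of one drum
    track feeding its own 1-bit ALU. All tracks advance in lockstep.
    """
    n = len(a_words)
    carry = [0] * n      # one carry bit per ALU
    result = [0] * n
    for bit in range(width):          # one step per bit position
        for i in range(n):            # conceptually simultaneous across ALUs
            a = (a_words[i] >> bit) & 1
            b = (b_words[i] >> bit) & 1
            s = a ^ b ^ carry[i]                       # 1-bit full-adder sum
            carry[i] = (a & b) | (carry[i] & (a ^ b))  # 1-bit full-adder carry
            result[i] |= s << bit
    return result


sums = bit_serial_add([1, 2, 3], [4, 5, 6])
```

The total time depends on the word width rather than on the number of words, which is the trade the concept makes: each addition is slow, but many of them proceed at once.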
Slotnick raised the idea at the IAS, but
John von Neumann
dismissed it as requiring "too many tubes". Slotnick left the IAS in February 1954 to return to school for his PhD and the matter was forgotten.
SOLOMON
After completing his PhD and some post-doc work, Slotnick ended up at IBM. By this time, for scientific computing at least, tubes and drums had been replaced with transistors and
core memory
. The idea of parallel processors working on different streams of data from a drum no longer had the same obvious appeal. Nevertheless, further consideration showed that parallel machines could still offer significant performance in some applications; Slotnick and a colleague, John Cocke, wrote a paper on the concept in 1958.
After a short time at IBM and then another at
Aeronca Aircraft
, Slotnick ended up at Westinghouse's Air Arm division, which worked on
radar
and similar systems. Under a contract from the
US Air Force
's RADC, Slotnick was able to build a team to design a system with 1,024 bit-serial ALUs, known as "processing elements" or PE's. This design was given the name SOLOMON, after
King Solomon
, who was both very wise and had 1,000 wives.
The PE's would be fed instructions from a single master
central processing unit
(CPU), the "control unit" or CU. SOLOMON's CU would read instructions from memory, decode them, and then hand them off to the PE's for processing. Each PE had its own memory for holding operands and results, the PE Memory module, or PEM. The CU could access the entire memory via a dedicated
memory bus
, whereas the PE's could only access their own PEM. To allow results from one PE to be used as inputs in another, a separate network connected each PE to its eight closest neighbours.
Several testbed systems were constructed, including a 3-by-3 (9 PE) system and a 10-by-10 model with simplified PEs. During this period, some consideration was given to more complex PE designs, becoming a 24-bit parallel system that would be organized in a 256-by-32 arrangement. A single PE using this design was built in 1963. As the design work continued, the primary sponsor within the
US Department of Defense
was killed in an accident and no further funding was forthcoming.
Looking to continue development, Slotnick approached Livermore, which at that time was at the forefront of supercomputer purchases. They were very interested in the design but convinced him to upgrade the current design's fixed-point math units to true
floating point
, which resulted in the SOLOMON.2 design.
Livermore would not fund development; instead, they offered a contract in which they would lease the machine once it was completed. Westinghouse management considered it too risky, and shut down the team. Slotnick left Westinghouse and attempted to find
venture capital
to continue the project, but failed. Livermore would later select the CDC STAR-100 for this role, as CDC was willing to take on the development costs.
ILLIAC IV
When SOLOMON ended, Slotnick joined the Illinois Automatic Computer design (ILLIAC) team at the University of Illinois at Urbana-Champaign. Illinois had been designing and building large computers for the U.S. Department of Defense and the
Advanced Research Projects Agency
(ARPA) since 1949. In 1964 the University signed a contract with ARPA to fund the effort, which became known as ILLIAC IV, since it was the fourth computer designed and created at the University. Development started in 1965, and a first-pass design was completed in 1966.
In contrast to the bit-serial concept of SOLOMON, in ILLIAC IV the PE's were upgraded to be full 64-bit (bit-parallel) processors, using 12,000
gates
and 2,048 words of
thin-film memory
. The PEs had five 64-bit registers, each with a special purpose. One of these, RGR, was used for communicating data to neighbouring PEs, moving one "hop" per clock cycle. Another register, RGD, indicated whether or not that PE was currently active. "Inactive" PEs could not access memory, but they would pass results to neighbouring PEs using the RGR. The PEs were designed to work as a single 64-bit FPU, two 32-bit half-precision FPUs, or eight 8-bit fixed-point processors.
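The interplay of the enable bit and the routing register can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the machine's instruction set: the function names are hypothetical, the PEs are arranged as a one-dimensional ring purely for brevity, and a plain Python list stands in for the RGR contents across the array.

```python
def broadcast_add(acc, enabled, operand):
    """Apply one broadcast ADD: only enabled PEs update their accumulator."""
    return [a + operand if on else a for a, on in zip(acc, enabled)]


def route_one_hop(rgr):
    """Shift routing-register contents one PE to the right on a ring.

    Every PE forwards its value whether or not it is enabled, matching
    the behaviour described in the text. The ring topology is an
    assumption made for this sketch.
    """
    return [rgr[i - 1] for i in range(len(rgr))]


acc = [0, 1, 2, 3]
enabled = [True, False, True, False]   # stand-in for the enable state held in RGD
acc = broadcast_add(acc, enabled, 10)  # PEs 0 and 2 execute; 1 and 3 sit idle
rgr = route_one_hop(acc)               # all four PEs still pass data along
```

Masking is what lets a single instruction stream handle data-dependent work: a conditional becomes "disable the PEs where the condition is false, then issue the instruction to everyone".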
Instead of 1,024 PEs and a single CU, the new design had a total of 256 PEs arranged into four 64-PE "quadrants", each with its own CU. The CU's were also 64-bit designs, with sixty-four 64-bit registers and another four 64-bit accumulators. The system could run as four separate 64-PE machines, two 128-PE machines, or a single 256-PE machine. This allowed the system to work on different problems when the data was too small to demand the entire 256-PE array.
Based on a 25 MHz clock, with all 256 PEs running on a single program, the machine was designed to deliver 1 billion floating point operations per second, or in today's terminology, 1
GFLOPS
. This made it much faster than any machine in the world; the contemporary
CDC 7600
had a clock cycle of 27.5 nanoseconds, or 36 MIPS, although for a variety of reasons it generally offered performance closer to 10 MIPS.
To support the machine, an extension to the Digital Computer Laboratory buildings was constructed. Sample work at the University was primarily aimed at ways to efficiently fill the PEs with data, thus conducting the first "stress test" in computer development. In order to make this as easy as possible, several new
computer language
s were created; IVTRAN and TRANQUIL were parallelized versions of FORTRAN, and Glypnir was a similar conversion of
ALGOL
. Generally, these languages provided support for loading arrays of data "across" the PEs to be executed in parallel, and some even supported the unwinding of loops into array operations.
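The idea of unwinding a loop into array operations can be shown with a short sketch. It is written in Python rather than in IVTRAN, TRANQUIL or Glypnir, whose syntax is not reproduced here; the function names and the choice of 64 lanes (one quadrant's worth of PEs) are assumptions for illustration.

```python
LANES = 64  # assumed: one quadrant's worth of PEs


def add_scalar_loop(a, b):
    """Conventional form: one element per loop iteration."""
    c = []
    for x, y in zip(a, b):
        c.append(x + y)
    return c


def add_array_form(a, b, lanes=LANES):
    """Array form: each pass stands for one broadcast instruction in
    which every lane adds its own pair of elements."""
    c = []
    for start in range(0, len(a), lanes):
        stop = start + lanes
        c.extend(x + y for x, y in zip(a[start:stop], b[start:stop]))
    return c


def broadcast_steps(n, lanes=LANES):
    """Number of broadcast instructions needed for n elements."""
    return -(-n // lanes)  # ceiling division


a = list(range(130))
b = [2 * x for x in a]
same = add_array_form(a, b) == add_scalar_loop(a, b)
steps = broadcast_steps(len(a))
```

With 130 elements the scalar loop runs 130 iterations, while the array form needs only three passes, the last of which leaves most lanes idle. Keeping the lanes full is the data-layout problem the text describes.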
Construction, problems
In early 1966, a Request for Proposals was sent out by the University looking for industrial partners interested in building the design. Seventeen responses were received in July, seven responded, and of these three were selected. Several of the responses, including
Control Data
, attempted to interest them in a
vector processor
design instead, but as these were already being designed the team was not interested in building another. In August 1966, eight-month contracts were offered to
RCA, Burroughs, and Univac
to bid on the construction of the machine.
Burroughs eventually won the contract, having teamed up with
Texas Instruments
(TI). Both offered new technical advances that made their bid the most interesting. Burroughs was offering to build a new and much faster version of
thin-film memory
which would improve performance. TI was offering to build 64-pin
emitter-coupled logic
(ECL)
integrated circuit
s (ICs) with 20
logic gate
s each. At the time, most ICs used 16-pin packages and had between 4 and 7 gates. Using TI's ICs would make the system much smaller.
Burroughs also supplied the specialized
disk drive
s, which featured a separate stationary head for every track, could offer speeds up to 500 Mbit/s, and stored about 80 MB per 36-inch disk. They would also provide a Burroughs B6500 mainframe to act as a front-end controller, loading data from secondary storage and performing other housekeeping tasks. Connected to the B6500 was a third-party laser optical recording medium, a write-once system that stored up to 1
Tbit
on thin metal film coated on a strip of polyester sheet carried by a rotating drum. Construction of the new design began at Burroughs' Great Valley Lab. At the time, it was estimated the machine would be delivered in early 1970.
After a year of working on the ICs, TI announced that they had been unable to build the 64-pin designs. The more complex internal wiring was causing
crosstalk
in the circuitry, and they asked for another year to fix the problems. Instead, the ILLIAC team chose to redesign the machine based on available 16-pin ICs. This required the system to run slower, using a 16 MHz clock instead of the original 25 MHz. The change from 64-pin to 16-pin cost the project about two years, and millions of dollars. TI was able to get the 64-pin design working after just over another year, and began offering them on the market before ILLIAC was complete.
As a result of this change, the individual
PC board
s grew considerably larger. This doomed Burroughs' efforts to produce a thin-film memory for the machine, because there was no longer enough space for the memory to fit within the design's cabinets. Attempts to increase the size of the cabinets to make room for the memory caused serious problems with signal propagation. Slotnick surveyed the potential replacements and picked a semiconductor memory from
Fairchild Semiconductor
, a decision that was so opposed by Burroughs that a full review by ARPA followed.
In 1969, these problems, combined with the resulting cost overruns from the delays, led to the decision to build only a single 64-PE quadrant, thereby limiting the machine's speed to about 200 MFLOPS. Together, these changes cost the project three years and $6 million. By 1969, the project was spending $1 million a month, and had to be spun out of the original ILLIAC team who were becoming increasingly vocal in their opposition to the project.
Move to Ames
By 1970, the machine was finally being built at a reasonable rate and it was being readied for delivery in about a year. On 6 January 1970, ''
The Daily Illini
'', the student newspaper, claimed that the computer would be used to design nuclear weapons. In May, the
Kent State shootings
took place, and anti-war violence erupted across university campuses.
Slotnick grew to be opposed to the use of the machine for classified research, and announced that as long as it was on university grounds, all processing that took place on the machine would be publicly released. He also grew increasingly concerned that the machine would be subject to attack by the more radical student groups, a position that seemed wise after the local students joined the 9 May 1970 nationwide student strike by declaring a "day of Illiaction", and especially after the 24 August bombing of the mathematics building at the
University of Wisconsin–Madison
.
With the help of
Hans Mark
, the director of the
NASA Ames Research Center
in what was becoming
Silicon Valley
, in January 1971 the decision was made to deliver the machine to Ames rather than the university. Located on an active
US Navy
base and protected by the
U.S. Marines
, security would no longer be a concern. The machine was finally delivered to Ames in April 1972, and installed in the Central Computer Facility in building N-233. By this point it was several years late and well over budget at a total price of $31 million, almost four times the original estimate of $8 million for the complete 256-PE machine.
NASA also decided to replace the B6500 front-end machine with a
PDP-10
, which was in common use at Ames and would make it much easier to connect to the ARPANET. This required the development of new software, especially compilers, on the PDP-10, which caused further delays in bringing the machine online.
Management of the ILLIAC IV was contracted to ACTS Computing Corporation, headquartered in Southfield, Michigan, a timesharing and remote job entry (RJE) company that had recently been acquired by the conglomerate Lear Siegler Corporation. The DoD contracted with ACTS under a cost-plus-10% contract. This unusual arrangement was due to the constraint that no government employee could be paid more than a member of Congress, and many ILLIAC IV personnel earned more than that limit. Dr. Mel Pirtle, with a background at the University of California, Berkeley and the Berkeley Computer Corporation (BCC), was engaged as the ILLIAC IV's director.
Making it work
When the machine first arrived, it could not be made to work. It suffered from all sorts of problems from cracking PCBs, to bad
resistor
s, to the packaging of the TI ICs being highly sensitive to humidity. These issues were slowly addressed, and by the summer of 1973 the first programs were able to be run on the system although the results were highly questionable. Starting in June 1975, a concerted four-month effort began that required, among other changes, replacing 110,000 resistors, rewiring parts to fix propagation delay issues, improving filtering in the power supplies, and a further reduction in clock speed to 13 MHz. At the end of this process, the system was finally working properly.
From then on, the system ran Monday morning to Friday afternoon, providing 60 hours of up-time for the users, but requiring 44 hours of scheduled downtime. Nevertheless, it was increasingly used as NASA programmers learned ways to get performance out of the complex system. At first, performance was dismal, with most programs running at about 15 MFLOPS, about three times the average for the
CDC 7600
. Over time this improved, notably after Ames programmers wrote their own version of FORTRAN, CFD, and learned how to parallelize I/O into the limited PEMs. On problems that could be parallelized the machine was still the fastest in the world, outperforming the CDC 7600 by two to six times, and it is generally credited as the fastest machine in the world until 1981.
On 7 September 1981, after nearly 10 years of operation, the ILLIAC IV was turned off ("This Day in History: September 7", Computer History Museum). The machine was officially decommissioned in 1982, and NASA's advanced computing division ended with it. One control unit and one processing element chassis from the machine are now on display at the Computer History Museum in Mountain View, less than a mile from its operational site.
Aftermath
ILLIAC was very late, very expensive, and never met its goal of producing 1 GFLOPS. It was widely considered a failure even by those who worked on it; one stated simply that "any impartial observer has to regard Illiac IV as a failure in a technical sense." In terms of project management it is widely regarded as a failure, running over its cost estimates by four times and requiring years of remedial efforts to make it work. As Slotnick himself later put it:
However, later analyses note that the project had several long-lasting effects on the computer market as a whole, both intentionally and unintentionally.
Among the indirect effects was the rapid uptake of semiconductor memory after the ILLIAC project. Slotnick received a lot of criticism when he chose Fairchild Semiconductor to produce the memory ICs, as at the time the production line was an empty room and the design existed only on paper. However, after three months of intense effort, Fairchild had a working design being produced ''en masse''. As Slotnick would later comment, "Fairchild did a magnificent job of pulling our chestnuts out of the fire. The Fairchild memories were superb and their reliability to this day is just incredibly good." ILLIAC is considered to have dealt a death blow to core memory and related systems like thin-film.
Another indirect effect was caused by the complexity of the printed circuit boards (PCBs), or modules. At the original 25 MHz design speed, impedance in the ground wiring proved to be a serious problem, demanding that the PCBs be as small as possible. As their complexity grew, the PCBs had to add more and more layers in order to avoid growing larger. Eventually, they reached 15 layers deep, which proved to be well beyond the capabilities of draftsmen. The design was ultimately completed using new automated design tools provided by a subcontractor, and the complete design required two years of computer time on a Burroughs mainframe. This was a major step forward in computer-aided design, and by the mid-1970s such tools were commonplace.
ILLIAC also led to major research into the topic of parallel processing that had wide-ranging effects. During the 1980s, with the price of microprocessors falling according to Moore's Law, a number of companies created MIMD (multiple instruction, multiple data) designs to build even more parallel machines, with compilers that could make better use of the parallelism. The Thinking Machines CM-5 is an excellent example of the MIMD concept. It was the better understanding of parallelism on ILLIAC that led to the improved compilers and programs that could take advantage of these designs. As one ILLIAC programmer put it, "If anybody builds a fast computer out of a lot of microprocessors, Illiac IV will have done its bit in the broad scheme of things."
Most supercomputers of the era took another approach to higher performance, using a single very high speed vector processor. Similar to the ILLIAC in some ways, these processor designs loaded up many data elements into a single custom processor instead of a large number of specialized ones. The classic example of this design is the Cray-1, which had performance similar to the ILLIAC. There was more than a little "backlash" against the ILLIAC design as a result, and for some time the supercomputer market looked on massively parallel designs with disdain, even when they were successful. As Seymour Cray
Description
Physical arrangement
Each quadrant of the machine was high, deep and long. Arranged beside the quadrant was its input/output (I/O) system, whose disk system stored 2.5 GiB and could read and write data at 1 billion bits per second, along with the B6700 computer that connected to the machine through the same 1,024-bit-wide interface as the disk system.
The machine consisted of a series of carrier chassis holding a number of the small modules. The majority of these were the Processing Units (PUs), which contained the modules for a single PE, its PEM, and the Memory Logic Unit that handled address translation and I/O. The PUs were identical, so they could be replaced or reordered as required.
Processor details
Each CU had about 30,000 to 40,000 gates. The CU had sixteen 64-bit registers and a separate sixty-four-slot 64-bit "scratchpad", LDB. There were four accumulators, AC0 through AC3, a program counter ILR, and various control registers. The system had a short instruction pipeline and implemented instruction look-ahead.
Each PE had about 12,000 gates. It included four 64-bit registers: an accumulator A, an operand buffer B and a secondary scratchpad S, plus a fourth, R, used to broadcast or receive data from the other PEs. The PEs used a carry-lookahead adder, a leading-one detector for boolean operations, and a barrel shifter. 64-bit additions took about 200 ns and multiplications about 400 ns. Each PE was connected to a private memory bank, the PEM, which held 2,048 64-bit words. Access time was on the order of 250 ns. The PEs used a load/store architecture.
The instruction set (ISA) contained two separate sets of instructions, one for the CU (or a unit within it, ADVAST) and another for the PEs. Instructions for the PEs were not decoded by the CU; instead, they were placed directly in the FINST register and forwarded to the PEs for processing. The ADVAST instructions were decoded and entered the CU's processing pipeline.
Logical arrangement
Each quadrant contained 64 PEs and one CU. The CU had access to the entire I/O bus and could address all of the machine's memory. The PEs could only access their own local store, the PEM, of 2,048 64-bit words. Both the PEs and CU could use load and store operations to access the disk system.
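The figures above fix the total PEM capacity of a quadrant, and the arithmetic is easy to check. The following is a minimal sketch; the function name is hypothetical, and the conversion to bytes assumes 8-bit bytes, which the text does not state.

```python
PES_PER_QUADRANT = 64   # from the text
WORDS_PER_PEM = 2048    # from the text
BITS_PER_WORD = 64      # from the text


def quadrant_pem_capacity():
    """Return (total words, total bytes) of PEM storage in one quadrant.

    The byte figure assumes 8-bit bytes (an assumption for illustration).
    """
    words = PES_PER_QUADRANT * WORDS_PER_PEM
    total_bytes = words * BITS_PER_WORD // 8
    return words, total_bytes


words, total_bytes = quadrant_pem_capacity()
print(words)                 # 131072
print(total_bytes)           # 1048576
print(total_bytes / 2**20)   # 1.0 (MiB)
```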
The cabinets were so large that it required 240 ns for signals to travel from one end to the other. For this reason, the CU could not be used to coordinate actions; instead, the entire system was clock-synchronous, with all operations in the PEs guaranteed to take the same amount of time no matter what the operands were. That way the CU could be sure that the operations were complete without having to wait for results or status codes.
To improve the performance of operations that required the output of one PE's results to be used as the input to another PE, the PEs were connected directly to their neighbours, as well as the ones eight steps away. For instance, PE1 was directly connected to PE0 and PE2, as well as PE9 and PE57 (eight steps in either direction, wrapping around the 64 PEs). The eight-away connections allowed faster transport when the data needed to travel between more distant PEs. Each shift of data moved 64 words in a single 125 ns clock cycle.
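The neighbour wiring can be modelled as a small graph. The sketch below is illustrative only: it assumes the connections wrap around modulo 64, and the function names are hypothetical. It lists the PEs directly wired to a given PE and uses a breadth-first search to count the fewest routing steps between two PEs.

```python
from collections import deque

NUM_PES = 64             # PEs per quadrant, from the text
STEPS = (-8, -1, 1, 8)   # nearest neighbours and eight-away links


def neighbours(pe, num_pes=NUM_PES):
    """PEs directly wired to `pe`, assuming wrap-around at the ends."""
    return sorted({(pe + step) % num_pes for step in STEPS})


def hops(src, dst, num_pes=NUM_PES):
    """Fewest routing steps from src to dst (breadth-first search)."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        cur = queue.popleft()
        if cur == dst:
            return dist[cur]
        for nxt in neighbours(cur, num_pes):
            if nxt not in dist:
                dist[nxt] = dist[cur] + 1
                queue.append(nxt)
    raise ValueError("destination unreachable")


print(neighbours(10))   # [2, 9, 11, 18]
print(hops(0, 16))      # 2 (two eight-away steps)
```

Without the eight-away links, moving data from PE0 to PE16 would take 16 single steps; with them it takes two, which is the benefit the paragraph above describes.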
The system used a one-address format, in which the instructions included the address of one of the operands and the other operand was in the PE's accumulator (the A register). The address was sent to the PEs over a separate "broadcast" bus. Depending on the instruction, the value on the bus might refer to a memory location in the PE's PEM, a value in one of the PE registers, or a numeric constant.
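A one-address format can be illustrated with a toy accumulator machine. This is a sketch under stated assumptions, not the ILLIAC IV instruction set: the opcode names (LOAD, ADD, ADDR, ADDI, STORE) are hypothetical, and only the register names and PEM size come from the text. Every PE receives the same opcode and the same broadcast operand, and the instruction decides whether that operand is a PEM address, a register name, or a constant.

```python
from dataclasses import dataclass, field

PEM_WORDS = 2048          # words per PEM, from the text
MASK64 = (1 << 64) - 1    # keep results within a 64-bit word


@dataclass
class PE:
    a: int = 0  # accumulator (the A register)
    regs: dict = field(default_factory=lambda: {"B": 0, "S": 0, "R": 0})
    pem: list = field(default_factory=lambda: [0] * PEM_WORDS)


def step(pes, op, operand):
    """Apply one broadcast instruction to every PE in lockstep.

    Opcode names are hypothetical, chosen only for this illustration.
    """
    for pe in pes:
        if op == "LOAD":       # operand is a PEM address
            pe.a = pe.pem[operand]
        elif op == "ADD":      # operand is a PEM address
            pe.a = (pe.a + pe.pem[operand]) & MASK64
        elif op == "ADDR":     # operand names a PE register
            pe.a = (pe.a + pe.regs[operand]) & MASK64
        elif op == "ADDI":     # operand is a numeric constant
            pe.a = (pe.a + operand) & MASK64
        elif op == "STORE":    # operand is a PEM address
            pe.pem[operand] = pe.a
        else:
            raise ValueError(f"unknown opcode: {op}")


# Four PEs hold different data at the same PEM addresses.
pes = [PE() for _ in range(4)]
for i, pe in enumerate(pes):
    pe.pem[0] = i
    pe.pem[1] = 10

program = [("LOAD", 0), ("ADD", 1), ("ADDI", 5), ("STORE", 2)]
for op, operand in program:
    step(pes, op, operand)

print([pe.pem[2] for pe in pes])  # [15, 16, 17, 18]
```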
Since each PE had its own memory, while the instruction format and the CUs saw the entire address space, the system included an index register (X) to offset the base address. This allowed, for example, the same instruction stream to work on data that was not aligned in the same locations in different PEs. The common example would be an array of data that was loaded into different locations in the PEMs, which could then be made uniform by setting the index in the different PEs.
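The effect of the index register can be sketched as a per-PE address calculation: the CU broadcasts one base address, and each PE adds its own X value before accessing its PEM. The helper names below are hypothetical, and the bounds check is an assumption added for safety rather than documented machine behaviour.

```python
PEM_WORDS = 2048  # words per PEM, from the text


def effective_address(base, x):
    """Address actually used by one PE: broadcast base plus its own index X."""
    addr = base + x
    if not 0 <= addr < PEM_WORDS:
        raise IndexError(f"address {addr} is outside the PEM")
    return addr


def broadcast_load(pems, index_regs, base):
    """Every PE loads from the same base address, offset by its own X."""
    return [pem[effective_address(base, x)]
            for pem, x in zip(pems, index_regs)]


# Three PEs hold a four-element array at different starting offsets.
pems = [[0] * PEM_WORDS for _ in range(3)]
starts = [100, 140, 7]
for pe, start in enumerate(starts):
    for k in range(4):
        pems[pe][start + k] = 10 * pe + k

# Setting each PE's X to its array start lets one broadcast address
# (here 2, meaning "element 2") reach the right word in every PEM.
print(broadcast_load(pems, starts, 2))  # [2, 12, 22]
```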
Branches
In traditional computer designs, instructions are loaded into the CPU one at a time as they are read from memory. Normally, when the CPU completes processing an instruction, the program counter (PC) is incremented by one word and the next instruction is read. This process is interrupted by branches, which cause the PC to jump to one of two locations depending on a test, like whether a given memory address holds a non-zero value. In the ILLIAC design, each PE would be applying this test to different values, and thus have different outcomes. Since those values are private to the PE, the following instructions would need to be loaded based on a value only the PE knew.
To avoid the delays that reloading the PE instructions would cause, the ILLIAC loaded the PEMs with the instructions on both sides of the branch. Logical tests did not change the PC; instead, they set "mode bits" that told the PE whether or not to run the next arithmetic instruction. To use this system, the program would be written so that one of the two possible instruction streams followed the test, and ended with an instruction to invert the bits. Code for the second branch would then follow, ending with an instruction to set all the bits to 1.
If the test selected the "first" branch, that PE would continue on as normal. When it reached the end of that code, the mode operator instruction would flip the mode bits, and from then on that PE would ignore further instructions. This would continue until it reached the end of the code for the second branch, where the mode reset instruction would turn the PE back on. If a particular PE's test resulted in the second branch being taken, it would instead set the mode bits to ignore further instructions until it reached the end of the first branch, where the mode operator would flip the bits and cause the second branch to begin processing, once again turning them all on at the end of that branch.
Since the PEs can operate in 64-, 32- and 8-bit modes, the mode flags had multiple bits so the individual words could be turned on or off. For instance, in the case when the PE was operating in 32-bit mode, one "side" of the PE might have the test come out true while the other side was false.
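The mode-bit scheme amounts to what is now usually called predicated or masked execution. The sketch below is a simplified model, not ILLIAC IV code: it uses a single mode bit per PE (as in 64-bit mode), and the function and parameter names are hypothetical. Both branches are issued to every PE in sequence, and the mode bit decides which PEs actually apply each one.

```python
def run_if_else(values, test, then_op, else_op):
    """Run both sides of a branch on all PEs, gated by per-PE mode bits."""
    acc = list(values)

    # The logical test sets each PE's mode bit; the PC is not changed.
    mode = [bool(test(v)) for v in acc]

    # First branch: only PEs whose mode bit is set execute it.
    acc = [then_op(a) if m else a for a, m in zip(acc, mode)]

    # End of first branch: invert the mode bits.
    mode = [not m for m in mode]

    # Second branch: now the other PEs execute and the first group idles.
    acc = [else_op(a) if m else a for a, m in zip(acc, mode)]

    # End of second branch: set all mode bits to 1 so every PE is active.
    mode = [True] * len(mode)
    return acc, mode


# Each list element stands for the accumulator of one PE.
result, mode = run_if_else(
    [3, -2, 0, -7],
    test=lambda v: v < 0,      # first branch taken where the value is negative
    then_op=lambda v: -v,      # first branch: negate
    else_op=lambda v: v * 2,   # second branch: double
)
print(result)  # [6, 2, 0, 7]
```

The cost of this approach is that every PE spends time on both branches, even though it only performs useful work in one of them.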
Terminology
* CU: control unit
* CPU: central processing unit
* ISA: instruction set architecture
* MAC: multiply-and-accumulate
* PC: program counter
* PE: processing element
* PEM: processing element memory module
* PU: processing unit
See also
* Amdahl's law, which suggests there are limits to the performance increase of parallel computers
* ILLIAC III, a special-purpose SIMD machine built around the same time as ILLIAC IV
* Parallel Element Processing Ensemble, another massively-parallel Burroughs machine, this one a Bell Labs