Intel Teraflops Research Chip (codenamed Polaris) is a research manycore processor containing 80 cores, using a network-on-chip architecture, developed by Intel's Tera-Scale Computing Research Program. It was manufactured using a 65 nm CMOS process with eight layers of copper interconnect and contains 100 million transistors on a 275 mm² die. Its design goal was to demonstrate a modular architecture capable of a sustained performance of 1.0 TFLOPS while dissipating less than 100 W. Research from the project was later incorporated into Xeon Phi. The technical lead of the project was Sriram R. Vangal. The processor was initially presented at the Intel Developer Forum on September 26, 2006 and officially announced on February 11, 2007. A working chip was presented at the 2007 IEEE International Solid-State Circuits Conference, alongside technical specifications.


Architecture

The chip consists of a 10×8 2D mesh network of cores and nominally operates at 4 GHz, though Intel later showed it running as high as 5.67 GHz. Each core, called a tile (3 mm²), contains a processing engine and a 5-port wormhole-switched router (0.34 mm²) with mesochronous interfaces, offering a bandwidth of 80 GB/s and a latency of 1.25 ns at 4 GHz.

The processing engine in each tile contains two independent, 9-stage pipelined, single-precision floating-point multiply-accumulator (FPMAC) units, 3 KB of single-cycle instruction memory and 2 KB of data memory. Each FPMAC unit can perform 2 single-precision floating-point operations per cycle, so each tile has an estimated peak performance of 16 GFLOPS at the standard configuration of 4 GHz. A 96-bit very long instruction word (VLIW) encodes up to eight operations per cycle. The custom instruction set includes instructions to send and receive packets into and from the chip's network, as well as instructions for sleeping and waking a particular tile.

Underneath each tile, a 256 KB SRAM module (codenamed Freya) was 3D-stacked, bringing memory nearer to the processor and increasing overall memory bandwidth to 1 TB/s, at the expense of higher cost, thermal stress and latency, and a small total capacity of 20 MB. The network of Polaris was shown to have a bisection bandwidth of 1.6 Tbit/s at 3.16 GHz and 2.92 Tbit/s at 5.67 GHz.

Other prominent features of the Teraflops Research Chip include fine-grained power management, with 21 independent sleep regions on a tile and dynamic tile sleep, and very high energy efficiency, with a theoretical peak of 27 GFLOPS/W at 0.6 V and a measured 19.4 GFLOPS/W on a stencil workload at 0.75 V.
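
The headline figures in this section follow from simple arithmetic on the per-tile numbers. The sketch below is illustrative only: the constant and function names are made up for this example, and it reproduces theoretical peaks from the quoted figures rather than any measured result.

```python
# Back-of-the-envelope check of the figures quoted in this section.
# All names are illustrative; the values are the ones stated in the text.

TILES = 10 * 8                   # 10x8 2D mesh of tiles
FPMACS_PER_TILE = 2              # two FPMAC units per processing engine
FLOPS_PER_FPMAC_PER_CYCLE = 2    # a multiply-accumulate counts as 2 operations
NOMINAL_CLOCK_GHZ = 4.0          # standard configuration
SRAM_PER_TILE_KB = 256           # stacked SRAM module under each tile
ROUTER_LATENCY_NS = 1.25         # router latency quoted at 4 GHz


def tile_peak_gflops(clock_ghz: float = NOMINAL_CLOCK_GHZ) -> float:
    """Theoretical peak of one tile in GFLOPS at the given clock."""
    return FPMACS_PER_TILE * FLOPS_PER_FPMAC_PER_CYCLE * clock_ghz


def chip_peak_tflops(clock_ghz: float = NOMINAL_CLOCK_GHZ) -> float:
    """Theoretical peak of all 80 tiles in TFLOPS at the given clock."""
    return TILES * tile_peak_gflops(clock_ghz) / 1000.0


def stacked_sram_mb() -> float:
    """Total stacked SRAM capacity in MB (1 MB taken as 1024 KB)."""
    return TILES * SRAM_PER_TILE_KB / 1024


def router_latency_cycles(clock_ghz: float = NOMINAL_CLOCK_GHZ) -> float:
    """Quoted router latency expressed in clock cycles."""
    return ROUTER_LATENCY_NS * clock_ghz


if __name__ == "__main__":
    print(f"per-tile peak : {tile_peak_gflops():.1f} GFLOPS")
    print(f"chip peak     : {chip_peak_tflops():.2f} TFLOPS")
    print(f"stacked SRAM  : {stacked_sram_mb():.0f} MB")
    print(f"router latency: {router_latency_cycles():.0f} cycles")
```

At the nominal 4 GHz this gives 16 GFLOPS per tile and a chip-wide theoretical peak of 1.28 TFLOPS, which leaves some headroom over the 1.0 TFLOPS sustained design goal. The 20 MB total SRAM capacity is likewise 80 tiles times 256 KB, and the 1.25 ns router latency corresponds to 5 cycles at 4 GHz.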


Issues

Intel aimed to ease software development for the new, exotic architecture by creating a programming model specifically for the chip, called Ct. The model never gained the following Intel had hoped for and was eventually incorporated into Intel Array Building Blocks, a now-defunct C++ library.


See also

* Single-chip Cloud Computer
* Xeon Phi

