The SPARC64 V (''Zeus'') is a SPARC V9 microprocessor designed by Fujitsu. The SPARC64 V was the basis for a series of successive processors designed for servers and, later, supercomputers. The server series comprises the SPARC64 V+, VI, VI+, VII, VII+, X, X+ and XII. The SPARC64 VI and its successors up to the VII+ were used in the Fujitsu and Sun (later Oracle) SPARC Enterprise M-Series servers. In addition to servers, a version of the SPARC64 VII was also used in the commercially available Fujitsu FX1 supercomputer. As of October 2017, the SPARC64 XII is the latest server processor, and it is used in the Fujitsu and Oracle M12 servers. The supercomputer series was based on the SPARC64 VII and comprises the SPARC64 VIIIfx, IXfx, and XIfx. The SPARC64 VIIIfx was used in the K computer, and the SPARC64 IXfx in the commercially available PRIMEHPC FX10. As of July 2016, the SPARC64 XIfx is the latest supercomputer processor, and it is used in the Fujitsu PRIMEHPC FX100 supercomputer.


History

In the late 1990s, HAL Computer Systems, a subsidiary of Fujitsu, was designing a successor to the SPARC64 GP as the SPARC64 V. First announced at Microprocessor Forum 1999, the HAL SPARC64 V would have operated at 1 GHz and had a wide superscalar organization with superspeculation, an L1 instruction trace cache, a small but very fast 8 KB L1 data cache, and separate L2 caches for instructions and data. It was designed in Fujitsu's CS85 process, a 0.17 μm CMOS process with six levels of copper interconnect, and would have consisted of 65 million transistors on a 380 mm² die. Originally scheduled for a late 2001 release in Fujitsu GranPower servers, it was canceled in mid-2001 when HAL was closed by Fujitsu, and replaced by a Fujitsu design.

The first Fujitsu SPARC64 Vs were fabricated in December 2001. They operated at 1.1 to 1.35 GHz. Fujitsu's 2003 SPARC64 roadmap showed that the company planned a 1.62 GHz version for release in late 2003 or early 2004, but it was canceled in favor of the SPARC64 V+. The SPARC64 V was used by Fujitsu in their PRIMEPOWER servers. The SPARC64 V was first presented at Microprocessor Forum 2002. At introduction, it had the highest clock frequency of both SPARC and 64-bit server processors in production, and the highest SPEC rating of any SPARC processor.


Description

The SPARC64 V is a four-issue superscalar microprocessor with out-of-order execution. It was based on the Fujitsu GS8900 mainframe microprocessor ("SPARC64 V Processor For UNIX Server").


Pipeline

The SPARC64 V fetches up to eight instructions from the instruction cache during the first stage and places them into a 48-entry instruction buffer. In the next stage, four instructions are taken from this buffer, decoded and issued to the appropriate reserve stations. The SPARC64 V has six reserve stations: two that serve the integer units, one for the address generators, two for the floating-point units, and one for branch instructions. Each integer, address generator and floating-point unit has an eight-entry reserve station. Each reserve station can dispatch an instruction to its execution unit. Which instruction is dispatched depends first on operand availability and then on age: among instructions whose operands are available, older instructions are given higher priority than newer ones. The reserve stations can also dispatch instructions speculatively (speculative dispatch). That is, instructions can be dispatched to the execution units even when their operands are not yet available, provided that they will be available when execution begins. During stage six, up to six instructions are dispatched.
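The dispatch priority described above (operand availability first, then age) can be illustrated with a minimal sketch. The `Entry` class, its field names and the `select_for_dispatch` helper are hypothetical names chosen for this illustration; they do not model Fujitsu's actual selection logic or speculative dispatch.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """One hypothetical reserve-station entry."""
    name: str
    age: int              # larger value = has waited longer (older)
    operands_ready: bool  # True when all source operands are available


def select_for_dispatch(station):
    """Return the oldest entry whose operands are ready, or None."""
    ready = [e for e in station if e.operands_ready]
    if not ready:
        return None
    return max(ready, key=lambda e: e.age)


station = [
    Entry("add", age=5, operands_ready=False),  # oldest, but still waiting
    Entry("sub", age=3, operands_ready=True),
    Entry("xor", age=4, operands_ready=True),   # oldest ready entry
]
chosen = select_for_dispatch(station)
print(chosen.name)
```

In this example the oldest entry is skipped because its operands are not ready, so the next-oldest ready entry is selected.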


Register read

The register files are read during stage seven. The SPARC architecture has separate register files for integer and floating-point instructions. The integer register file has eight register windows. The JWR (Joint Work Register) contains 64 entries and has eight read ports and two write ports. The JWR contains a subset of the eight register windows: the previous, current and next register windows. Its purpose is to reduce the size of the register file so that the microprocessor can operate at higher clock frequencies. The floating-point register file contains 64 entries and has six read ports and two write ports.


Execution

Execution begins during stage nine. There are six execution units: two for integer, two for loads and stores, and two for floating-point. The two integer execution units are designated EXA and EXB. Both have an arithmetic logic unit (ALU) and a shift unit, but only EXA has multiply and divide units. Loads and stores are executed by two address generators (AGs) designated AGA and AGB. These are simple ALUs used to calculate virtual addresses.

The two floating-point units (FPUs) are designated FLA and FLB. Each FPU contains an adder and a multiplier, but only FLA has a graphics unit attached. They execute add, subtract, multiply, divide, square root and multiply–add instructions. Unlike its successor, the SPARC64 VI, the SPARC64 V performs the multiply–add with separate multiplication and addition operations, and thus with up to two rounding errors ("SPARC64 VI Extensions", page 56, Fujitsu Limited, Release 1.3, 27 March 2007). The graphics unit executes Visual Instruction Set (VIS) instructions, a set of single instruction, multiple data (SIMD) instructions. All instructions are pipelined except for divide and square root, which are executed using iterative algorithms. The FMA instruction is implemented by reading three operands from the operand register, multiplying two of the operands, forwarding the result and the third operand to the adder, and adding them to produce the final result.

Results from the execution units and loads are not written directly to the register file. To maintain program order, they are written to update buffers, where they reside until committed. The SPARC64 V has separate update buffers for integer and floating-point units, each with 32 entries. The integer update buffer has eight read ports and four write ports. Half of the write ports are used for results from the integer execution units and the other half for data returned by loads. The floating-point update buffer has six read ports and four write ports. Commit takes place during stage ten at the earliest. The SPARC64 V can commit up to four instructions per cycle. During stage eleven, results are written to the register file, where they become visible to software ("Microarchitecture and Performance Analysis of a SPARC-V9 Microprocessor for Enterprise Server Systems", p. 4).
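The effect of performing a multiply–add as separate multiplication and addition operations, with up to two roundings, can be shown with a small numerical sketch. It uses ordinary IEEE 754 double-precision arithmetic in Python, and emulates a single-rounding (fused) result with exact rational arithmetic; the operand values are chosen for illustration only and the code does not model the SPARC64 V hardware.

```python
from fractions import Fraction

# Operands chosen so that the exact product lies halfway between two doubles.
a = 1.0 + 2.0**-27
b = 1.0 - 2.0**-27
c = -1.0

# Separate multiply and add: the product is rounded, then the sum is rounded.
# The exact product 1 - 2**-54 rounds to 1.0, so the small term is lost.
unfused = a * b + c

# Single rounding: compute a*b + c exactly, then round once to a double.
fused = float(Fraction(a) * Fraction(b) + Fraction(c))

print(unfused)  # 0.0
print(fused)    # -5.551115123125783e-17, i.e. -2**-54
```

The two results differ because the intermediate rounding of the product discards information that a single final rounding would have kept.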


Cache

The SPARC64 V has a two-level cache hierarchy. The first level consists of two caches, an instruction cache and a data cache. The second level consists of an on-die unified cache. The level 1 (L1) caches each have a capacity of 128 KB. They are both two-way set associative and have a 64-byte line size. They are virtually indexed and physically tagged. The instruction cache is accessed via a 256-bit bus. The data cache is accessed with two 128-bit buses. The data cache consists of eight banks separated by 32-bit boundaries. It uses a write-back policy. The data cache writes to the L2 cache with its own 128-bit unidirectional bus. The second level cache has a capacity of 1 or 2 MB, and the set associativity depends on the capacity.
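As a rough illustration of what the L1 parameters above imply, the following sketch derives the number of sets and the address bits used for the line offset and set index. It assumes 1 KB = 1024 bytes and power-of-two sizes; `cache_geometry` is a hypothetical helper name, not something taken from Fujitsu documentation.

```python
def cache_geometry(capacity_bytes, ways, line_bytes):
    """Return (sets, offset_bits, index_bits) for a set-associative cache.

    Assumes capacity, associativity and line size are powers of two.
    """
    sets = capacity_bytes // (ways * line_bytes)
    offset_bits = line_bytes.bit_length() - 1  # log2(line size)
    index_bits = sets.bit_length() - 1         # log2(number of sets)
    return sets, offset_bits, index_bits


# SPARC64 V L1 caches: 128 KB, two-way set associative, 64-byte lines.
sets, offset_bits, index_bits = cache_geometry(128 * 1024, 2, 64)
print(sets, offset_bits, index_bits)  # 1024 6 10
```

Under these assumptions each L1 cache has 1,024 sets, selected by 10 index bits above a 6-bit line offset.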


System bus

The microprocessor has a 128-bit system bus that operates at 260 MHz. The bus can operate in two modes, single data rate (SDR) or double data rate (DDR), yielding a peak bandwidth of 4.16 or 8.32 GB/s, respectively.
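The peak bandwidth figures follow from the bus width, the clock frequency and the number of transfers per clock. A minimal sketch of that arithmetic, assuming decimal gigabytes (1 GB = 10^9 bytes) and a hypothetical helper name `peak_bandwidth_gb_s`:

```python
def peak_bandwidth_gb_s(width_bits, clock_hz, transfers_per_clock):
    """Peak bus bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = width_bits // 8
    return bytes_per_transfer * clock_hz * transfers_per_clock / 1e9


sdr = peak_bandwidth_gb_s(128, 260e6, 1)  # one transfer per clock
ddr = peak_bandwidth_gb_s(128, 260e6, 2)  # two transfers per clock
print(f"SDR: {sdr:.2f} GB/s, DDR: {ddr:.2f} GB/s")  # SDR: 4.16 GB/s, DDR: 8.32 GB/s
```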


Physical

The SPARC64 V consisted of 191 million transistors, of which 19 million are contained in logic circuits (p. 702). It was fabricated in a 0.13 μm, eight-layer copper metallization, complementary metal–oxide–semiconductor (CMOS) silicon on insulator (SOI) process. The die measured 18.14 mm by 15.99 mm for a die area of 290 mm².


Electrical

At 1.3 GHz, the SPARC64 V has a power dissipation of 34.7 W. The Fujitsu PrimePower servers that use the SPARC64 V supply a slightly higher voltage to the microprocessor to enable it to operate at 1.35 GHz. The increased power supply voltage and operating frequency increased the power dissipation to ~45 W.


SPARC64 V+

The SPARC64 V+, code-named "Olympus-B", is a further development of the SPARC64 V. Improvements over the SPARC64 V included higher clock frequencies of 1.82–2.16 GHz and a larger 3 or 4 MB L2 cache. The first SPARC64 V+, a 1.89 GHz version, was shipped in September 2004 in the Fujitsu PrimePower 650 and 850. In December 2004, a 1.82 GHz version was shipped in the PrimePower 2500. These versions have a 3 MB L2 cache. In February 2006, four versions were introduced: 1.65 and 1.98 GHz versions with 3 MB L2 caches shipped in the PrimePower 250 and 450; and 2.08 and 2.16 GHz versions with 4 MB L2 caches shipped in mid-range and high-end models. It contained approximately 400 million transistors on an 18.46 mm by 15.94 mm die for an area of 294.25 mm². It was fabricated in a 90 nm CMOS process with ten levels of copper interconnect.


SPARC64 VI

The SPARC64 VI, code-named Olympus-C, is a two-core processor (the first multi-core SPARC64 processor) which succeeded the SPARC64 V+. It is fabricated by Fujitsu in a 90 nm, 10-layer copper, CMOS silicon on insulator (SOI) process, which enabled two cores and an L2 cache to be integrated on a die. Each core is a modified SPARC64 V+ processor. One of the main improvements is the addition of two-way coarse-grained multi-threading (CMT), which Fujitsu called ''vertical multi-threading'' (VMT). In CMT, the thread to execute is selected by time-sharing; in addition, if the running thread executes a long-latency operation, execution is switched to the other thread. The addition of CMT required duplication of the program counter and the control, integer, and floating-point registers so there is one set of each for every thread. A floating-point fused multiply-add (FMA) instruction was also added, making the SPARC64 VI the first SPARC processor to have one. The cores share a 6 MB on-die unified L2 cache. The L2 cache is 12-way set associative and has 256-byte lines. The cache is accessed via two unidirectional buses, a 256-bit read bus and a 128-bit write bus. The SPARC64 VI has a new system bus, the Jupiter Bus. The SPARC64 VI consisted of 540 million transistors. The die measures 20.38 mm by 20.67 mm (421.25 mm²).

The SPARC64 VI was originally to have been introduced in mid-2004 in Fujitsu's PrimePower servers. Development of these PrimePower servers was canceled after Fujitsu and Sun Microsystems announced in June 2004 that they would collaborate on new servers called the Advanced Product Line (APL). These servers were scheduled to be introduced in mid-2006, but were delayed until April 2007, when they were introduced as the SPARC Enterprise. The SPARC64 VI processors featured in the SPARC Enterprise at its announcement were a 2.15 GHz version with a 5 MB L2 cache, and 2.28 and 2.4 GHz versions with 6 MB L2 caches.
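The two-way coarse-grained multi-threading policy described for the SPARC64 VI (switch on a time slice, or when the running thread hits a long-latency operation) can be sketched as a toy scheduler. Everything here is an assumption made for illustration: the `run_vmt` name, the `"op"`/`"miss"` trace encoding, and the time-slice length do not come from Fujitsu documentation, and real hardware switching is far more involved.

```python
def run_vmt(traces, time_slice):
    """Simulate two-way coarse-grained multi-threading on one core.

    traces: two lists of operations, each "op" (short) or "miss" (long latency).
    Returns the executed sequence as (thread_id, operation) pairs.
    """
    pcs = [0, 0]           # one program counter per thread (duplicated state)
    current, used = 0, 0   # running thread and cycles used in its time slice
    order = []
    while any(pcs[t] < len(traces[t]) for t in (0, 1)):
        if pcs[current] >= len(traces[current]):
            current, used = 1 - current, 0   # running thread is finished
            continue
        op = traces[current][pcs[current]]
        pcs[current] += 1
        used += 1
        order.append((current, op))
        other = 1 - current
        # Switch after a long-latency operation or when the time slice expires,
        # provided the other thread still has work to do.
        if (op == "miss" or used >= time_slice) and pcs[other] < len(traces[other]):
            current, used = other, 0
    return order


traces = [["op", "miss", "op"], ["op", "op", "op"]]
order = run_vmt(traces, time_slice=2)
print([thread for thread, _ in order])  # [0, 0, 1, 1, 0, 1]
```

Only one thread issues at a time in this model, which is the distinction between coarse-grained multi-threading and simultaneous multithreading.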


SPARC64 VII

The SPARC64 VII (previously called the SPARC64 VI+), code-named ''Jupiter'', is a further development of the SPARC64 VI announced in July 2008. It is a quad-core microprocessor. Each core is capable of two-way simultaneous multithreading (SMT), which replaces two-way coarse-grained multithreading, termed ''vertical multithreading'' (VMT) by Fujitsu. Thus, it can execute eight threads simultaneously. Other changes include more RAS features; the integer register file is now protected by ECC, and the number of error checkers has been increased to around 3,400. It consists of 600 million transistors, is 21.31 mm × 20.86 mm (444.63 mm²) large, and is fabricated by Fujitsu in its 65 nm CMOS, copper interconnect process.

The SPARC64 VII was featured in the SPARC Enterprise. It is socket-compatible with its predecessor, the SPARC64 VI, and is field-upgradeable. SPARC64 VIIs could coexist, whilst operating at their native clock frequency, alongside SPARC64 VIs. The first versions of the SPARC64 VII were a 2.4 GHz version with a 5 MB L2 cache used in the SPARC Enterprise M4000 and M5000, and a 2.52 GHz version with a 6 MB L2 cache. On 28 October 2008, a 2.52 GHz version with a 5 MB L2 cache was introduced in the SPARC Enterprise M3000. On 13 October 2009, Fujitsu and Sun introduced new versions of the SPARC64 VII (code-named ''Jupiter+''), a 2.53 GHz version with a 5.5 MB L2 cache for the M4000 and M5000, and a 2.88 GHz version with a 6 MB L2 cache for the M8000 and M9000. On 12 January 2010, a 2.75 GHz version with a 5 MB L2 cache was introduced in the M3000.


SPARC64 VII+

The SPARC64 VII+ (''Jupiter-E''), referred to as the M3 by Oracle, is a further development of the SPARC64 VII. The clock frequency was increased up to 3 GHz and the L2 cache size was doubled to 12 MB. This version was announced on 2 December 2010 for the high-end SPARC Enterprise M8000 and M9000 servers. These improvements resulted in an approximately 20% increase in overall performance. A 2.66 GHz version was offered for the mid-range M4000 and M5000 models. On 12 April 2011, a 2.86 GHz version with two or four cores and a 5.5 MB L2 cache was announced for the low-end M3000. The VII+ is socket-compatible with its predecessor, the VII. Existing high-end SPARC Enterprise M-Series servers can be upgraded to the VII+ processors in the field.


SPARC64 VIIIfx

The SPARC64 VIIIfx (''Venus'') is an eight-core processor based on the SPARC64 VII designed for high-performance computing (HPC). As a result, the VIIIfx did not succeed the VII, but existed concurrently with it. It consists of 760 million transistors, measures 22.7 mm by 22.6 mm (513.02 mm²), is fabricated in Fujitsu's 45 nm CMOS process with copper interconnects, and has 1,271 I/O pins. The VIIIfx has a peak performance of 128 GFLOPS and a typical power consumption of 58 W at 30 °C for an efficiency of 2.2 GFLOPS/W. The VIIIfx has four integrated memory controllers for a total of eight memory channels. It connects to 64 GB of DDR3 SDRAM and has a peak memory bandwidth of 64 GB/s.
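The efficiency figure can be checked directly from the peak performance and power numbers given above. The sketch below also derives the clock frequency implied by the peak figure, using the eight cores stated above and the eight FLOPs per cycle per core stated in the Description section; that clock value is a derived quantity, not a number quoted in this section.

```python
PEAK_GFLOPS = 128        # peak performance of the SPARC64 VIIIfx
POWER_W = 58             # typical power consumption at 30 °C
CORES = 8
FLOPS_PER_CYCLE = 8      # per core, from the SIMD description (four FMAs)

efficiency = PEAK_GFLOPS / POWER_W                          # GFLOPS per watt
implied_clock_ghz = PEAK_GFLOPS / (CORES * FLOPS_PER_CYCLE)  # derived value

print(f"{efficiency:.1f} GFLOPS/W")            # 2.2 GFLOPS/W
print(f"{implied_clock_ghz:.1f} GHz implied")  # 2.0 GHz implied
```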


History

The VIIIfx was developed for the Next-Generation Supercomputer Project (also called ''Kei Soku Keisenki'' and Project Keisoku) initiated by Japan's Ministry of Education, Culture, Sports, Science and Technology in January 2006. The project aimed to produce the world's fastest supercomputer with performance of over 10 PFLOPS by March 2011. The companies contracted to develop the supercomputer were Fujitsu, Hitachi, and NEC. The supercomputer was originally envisioned to have a hybrid architecture containing scalar and vector processors. The Fujitsu-designed VIIIfx was to have been the scalar processor, with the vector processor to have been jointly designed by Hitachi and NEC. However, due to the financial crisis of 2007–2008, Hitachi and NEC announced in May 2009 that they would leave the project because manufacturing the hardware they were responsible for would result in financial losses for them. Afterwards, Fujitsu redesigned the supercomputer to use the VIIIfx as its only processor type.

By 2010, the supercomputer that would be built by the project was named the K computer. Located at RIKEN's Advanced Institute for Computational Science (AICS) in Kobe, Japan, it obtains its performance from 88,128 VIIIfx processors. In June 2011, the TOP500 Project Committee announced that the K computer (still incomplete with only 68,544 processors) topped the LINPACK benchmark at 8.162 PFLOPS, realizing 93% of its peak performance, making it the fastest supercomputer in the world at that time.
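The 93% figure is consistent with the per-processor peak of 128 GFLOPS quoted for the VIIIfx. A short sketch of the arithmetic, assuming every processor contributes its full 128 GFLOPS peak:

```python
GFLOPS_PER_PROCESSOR = 128

# Partial system benchmarked in June 2011.
partial_peak_pflops = 68_544 * GFLOPS_PER_PROCESSOR / 1e6
linpack_pflops = 8.162
fraction_of_peak = linpack_pflops / partial_peak_pflops

# Full system with 88,128 processors.
full_peak_pflops = 88_128 * GFLOPS_PER_PROCESSOR / 1e6

print(f"partial peak: {partial_peak_pflops:.3f} PFLOPS")  # 8.774 PFLOPS
print(f"LINPACK / peak: {fraction_of_peak:.0%}")          # 93%
print(f"full peak: {full_peak_pflops:.2f} PFLOPS")        # 11.28 PFLOPS
```

The full-system peak under the same assumption exceeds the project's 10 PFLOPS target.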


Description

The VIIIfx core is based on that of the SPARC64 VII with numerous modifications for HPC, namely High Performance Computing-Arithmetic Computational Extensions (HPC-ACE), a Fujitsu-designed extension to the SPARC V9 architecture. In the front-end, coarse-grained multi-threading was removed; the L1 instruction cache was halved in size to 32 KB; the number of branch target address cache (BTAC) entries was reduced to 1,024 from 8,192, and its associativity was reduced to two from eight; and an extra pipeline stage was inserted before the instruction decoder. This stage accommodated the greater number of integer and floating-point registers defined by HPC-ACE. The SPARC V9 architecture was designed to have only 32 integer and 32 floating-point number registers. The SPARC V9 instruction encoding limited the number of registers specifiable to 32. To specify the extra registers, HPC-ACE has a "prefix" instruction that would immediately follow one or two SPARC V9 instructions. The prefix instruction contained (primarily) the portions of the register numbers that could not fit within a SPARC V9 instruction. This extra pipeline stage was where up to four SPARC V9 instructions were combined with up to two prefix instructions in the preceding stage. The combined instructions were then decoded in the next pipeline stage.

The back-end was also heavily modified. The number of reservation station entries for branch and integer instructions was reduced to six and ten, respectively. Both the integer and floating-point register files had registers added to them: the integer register file gained 32, and there were a total of 256 floating-point registers. The extra integer registers are not part of the register windows defined by SPARC V9, but are always accessible via the prefix instruction; and the 256 floating-point registers could be used by scalar floating-point instructions and by both integer and floating-point SIMD instructions. An extra pipeline stage was added to the beginning of the floating-point execution pipeline to access the larger floating-point register file. The 128-bit SIMD instructions from HPC-ACE were implemented by adding two extra floating-point units for a total of four. SIMD execution can perform up to four single- or double-precision fused-multiply-add operations (eight FLOPs) per cycle. The number of load queue entries was increased to 20 from 16, and the L1 data cache was halved in size to 32 KB. The number of commit stack entries, which determined the number of instructions that could be in-flight in the back-end, was reduced to 48 from 64.
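The idea behind the prefix instruction, supplying the register-number bits that do not fit in a SPARC V9 register field, can be sketched as simple bit concatenation. This is an illustration of the principle only: the function name, the bit layout and the argument names are assumptions made here, not the actual HPC-ACE encoding.

```python
V9_FIELD_BITS = 5  # a SPARC V9 register field can name 2**5 = 32 registers


def extend_register_number(v9_field, prefix_upper_bits, total_registers=256):
    """Combine a 5-bit V9 register field with upper bits from a prefix.

    Hypothetical layout: the prefix supplies the high-order bits and the
    V9 instruction supplies the low-order five bits.
    """
    extra_bits = (total_registers - 1).bit_length() - V9_FIELD_BITS
    if not 0 <= v9_field < (1 << V9_FIELD_BITS):
        raise ValueError("V9 register field must fit in 5 bits")
    if not 0 <= prefix_upper_bits < (1 << extra_bits):
        raise ValueError("prefix bits out of range")
    return (prefix_upper_bits << V9_FIELD_BITS) | v9_field


# With 256 floating-point registers, three extra bits are needed (5 + 3 = 8).
print(extend_register_number(3, 0))   # 3   (no extension: an original register)
print(extend_register_number(0, 1))   # 32  (first register beyond the V9 range)
print(extend_register_number(31, 7))  # 255 (last of 256 registers)
```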


Miscellaneous specifications

* Physical address range: 41 bits
* Cache:
:* L1: 32 KB two-way set-associative data, 32 KB two-way set-associative instruction (128-byte cache line), sectored
:* L2: 6 MB 12-way set-associative (128-byte line), index-hashed, sectored
* Translation lookaside buffer (TLB):
:* A 16-entry micro-TLB; and 256-entry, four-way set-associative TLB for instructions
:* A 512-entry, four-way set-associative TLB for data, no victim cache
* Page sizes: 8 KB, 64 KB, 512 KB, 4 MB, 32 MB, 256 MB, 2 GB
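The listed page sizes form a geometric series in which each size is eight times the previous one, and the 41-bit physical address range corresponds to 2 TB of addressable memory. A minimal sketch of both calculations, assuming binary units (1 KB = 1024 bytes):

```python
KB = 1024

# Each supported page size is 8x the previous one, starting at 8 KB.
page_sizes = [8 * KB * 8**k for k in range(7)]
offset_bits = [size.bit_length() - 1 for size in page_sizes]  # log2(page size)

physical_bytes = 2**41          # 41-bit physical address range
physical_tb = physical_bytes // KB**4

for size, bits in zip(page_sizes, offset_bits):
    print(f"{size:>12} bytes -> {bits}-bit page offset")
print(f"physical address space: {physical_tb} TB")  # 2 TB
```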


SPARC64 IXfx

The SPARC64 IXfx is an improved version of the SPARC64 VIIIfx designed by Fujitsu and LSI, first revealed in the announcement of the PRIMEHPC FX10 supercomputer on 7 November 2011. It, along with the PRIMEHPC FX10, is a commercialization of the technologies that first appeared in the VIIIfx and K computer. Compared to the VIIIfx, organizational improvements included doubling the number of cores to 16, doubling the amount of shared L2 cache to 12 MB, and increasing peak DDR3 SDRAM memory bandwidth to 85 GB/s. The IXfx operates at 1.848 GHz, has a peak performance of 236.5 GFLOPS, and consumes 110 W for a power efficiency of more than 2 GFLOPS per watt (Morgan, Timothy Prickett (7 November 2011). "Fujitsu readies 23 petaflops Sparc FX10 super beast". ''The Register''). It consisted of 1 billion transistors and was implemented in a 40 nm CMOS process with copper interconnects.
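The IXfx peak-performance and efficiency figures can be reproduced from the core count and clock frequency, under the assumption that each core retains the eight FLOPs per cycle described for the VIIIfx. That per-core figure is not stated for the IXfx in the text, so it is labeled as an assumption in the sketch below.

```python
CORES = 16
CLOCK_GHZ = 1.848
FLOPS_PER_CYCLE = 8   # assumption: same per-core SIMD throughput as the VIIIfx
POWER_W = 110

peak_gflops = CORES * FLOPS_PER_CYCLE * CLOCK_GHZ
gflops_per_watt = peak_gflops / POWER_W

print(f"peak: {peak_gflops:.1f} GFLOPS")              # peak: 236.5 GFLOPS
print(f"efficiency: {gflops_per_watt:.2f} GFLOPS/W")  # efficiency: 2.15 GFLOPS/W
```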


SPARC64 X

The SPARC64 X is a 16-core server microprocessor announced in 2012 and used in Fujitsu's M10 servers (which are also marketed by Oracle). The SPARC64 X is based on the SPARC64 VII+ with significant enhancements to its core and chip organization. The cores were improved by the inclusion of a pattern history table for branch prediction, speculative execution of loads, more execution units, support for the HPC-ACE extension (originally from the SPARC64 VIIIfx), a deeper pipeline for a 3.0 GHz clock frequency, and accelerators for cryptography, database, and decimal floating-point number arithmetic and conversion functions. The 16 cores share a unified, 24 MB, 24-way set-associative L2 cache. Chip organization improvements include four integrated DDR3 SDRAM memory controllers, glueless four-way symmetrical multiprocessing, ten SERDES channels for symmetrical multiprocessing scalability to 64 sockets, and two integrated PCI Express 3.0 controllers. The SPARC64 X contains 2.95 billion transistors, measures 23.5 mm by 25 mm (637.5 mm²), and is fabricated in a 28 nm CMOS process with copper interconnects.


SPARC64 X+

The SPARC64 X+ is an enhanced SPARC64 X processor announced in 2013. It features minor improvements to the core organization and a higher 3.5 GHz clock frequency obtained through better circuit design and layout. It contains 2.99 billion transistors, measures 24 mm by 25 mm (600 mm²), and is fabricated in the same process as the SPARC64 X. On 8 April 2014, 3.7 GHz speed-binned parts became available in response to the introduction of new Xeon E5 and E7 models by Intel and the impending introduction of the POWER8 by IBM.


SPARC64 XIfx

Fujitsu introduced the SPARC64 XIfx in August 2014 at the Hot Chips symposium. It is used in the Fujitsu PRIMEHPC FX100 supercomputer, which succeeded the PRIMEHPC FX10 (''Sparc-Prozessor für 100-Petaflop-Rechner'', Heise Newsticker, 6 August 2014; ''Next Generation PRIMEHPC'', Fujitsu Ltd., 2014). The XIfx operates at 2.2 GHz and has a peak performance of 1.1 TFLOPS (Agam Shah, PC World, 6 August 2014).
It consists of 3.75 billion transistors and is fabricated by the Taiwan Semiconductor Manufacturing Company in its 20 nm high-κ metal gate (HKMG) process. The ''Microprocessor Report'' estimated the die to have an area of 500 mm² and a typical power consumption of 200 W. The XIfx has 34 cores: 32 ''compute cores'' that run user applications and 2 ''assistant cores'' that run the operating system and other system services. Delegating user applications and the operating system to dedicated cores improves performance by ensuring that the private caches of the compute cores are not shared with or disrupted by non-application instructions and data. The 34 cores are further organized into two ''Core Memory Groups'' (''CMGs''), each consisting of 16 compute cores and 1 assistant core sharing a 12 MB unified L2 cache. The division of the cores into CMGs enabled 34 cores to be integrated on a single die by easing the implementation of cache coherence and avoiding the need for the L2 cache to be shared between 34 cores. The two CMGs share the memory through a ccNUMA organization.

The XIfx core is based on the SPARC64 X+ with organizational improvements. The XIfx implements an improved version of the HPC-ACE extensions (HPC-ACE2), which doubled the width of the SIMD units to 256 bits and added new SIMD instructions. Compared to the SPARC64 IXfx, the XIfx offers a performance improvement of a factor of 3.2 for double precision and 6.1 for single precision. To complement the increased width of the SIMD units, the L1 cache bandwidth was increased to 4.4 TB/s.

Improvements to the SoC organization were made to the memory and interconnect interfaces. The integrated memory controllers were replaced with four Hybrid Memory Cube (HMC) interfaces for decreased memory latency and improved memory bandwidth. According to the ''Microprocessor Report'', the XIfx was the first processor to use HMCs. The XIfx is connected to 32 GB of memory provided by eight 4 GB HMCs. The HMCs are 16-lane versions, with each lane operating at 15 Gbit/s. Each CMG has two HMC interfaces, and each HMC interface is connected to two HMCs via its own ports. Each CMG has 240 GB/s (120 GB/s in and 120 GB/s out) of memory bandwidth. The XIfx replaced the ten SERDES channels to an external Tofu interconnect controller with a ten-port integrated controller for the second-generation Tofu2 interconnect. Tofu2 is a 6D mesh/torus network with a 25 GB/s full-duplex bandwidth (12.5 GB/s per direction, 125 GB/s for ten ports) and an improved routing architecture.
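
The memory and interconnect figures above can be reconciled with simple arithmetic. The following sketch is a rough check, not a description of the actual hardware signalling: it assumes the 15 Gbit/s lane rate is usable payload with no encoding overhead, that each HMC provides 16 lanes in each direction, and that one byte is eight bits. All constant and variable names are hypothetical.

```python
# Figures quoted in the text for the SPARC64 XIfx.
CMGS = 2                      # Core Memory Groups per chip
COMPUTE_CORES_PER_CMG = 16
ASSISTANT_CORES_PER_CMG = 1
HMC_INTERFACES_PER_CMG = 2
HMCS_PER_INTERFACE = 2
HMC_CAPACITY_GB = 4
LANES_PER_HMC = 16
GBIT_PER_LANE = 15            # ASSUMPTION: treated as raw payload rate
TOFU2_PORTS = 10
TOFU2_GB_PER_DIRECTION = 12.5

# Core count: two CMGs of 16 compute cores plus 1 assistant core each.
total_cores = CMGS * (COMPUTE_CORES_PER_CMG + ASSISTANT_CORES_PER_CMG)

# Memory capacity: every HMC interface drives two 4 GB cubes.
hmcs_per_cmg = HMC_INTERFACES_PER_CMG * HMCS_PER_INTERFACE
total_memory_gb = CMGS * hmcs_per_cmg * HMC_CAPACITY_GB

# Bandwidth of one HMC in one direction, converted from Gbit/s to GB/s.
hmc_gb_per_direction = LANES_PER_HMC * GBIT_PER_LANE / 8

# Per-CMG memory bandwidth, one direction and both directions combined.
cmg_gb_per_direction = hmcs_per_cmg * hmc_gb_per_direction
cmg_gb_total = 2 * cmg_gb_per_direction

# Tofu2: aggregate bandwidth in one direction across all ten ports.
tofu2_gb_all_ports = TOFU2_PORTS * TOFU2_GB_PER_DIRECTION

print(total_cores, total_memory_gb)
print(hmc_gb_per_direction, cmg_gb_per_direction, cmg_gb_total)
print(tofu2_gb_all_ports)
```

Under these assumptions the numbers line up with the text: 34 cores, 32 GB of memory, 120 GB/s in each direction per CMG (240 GB/s combined), and 125 GB/s per direction across the ten Tofu2 ports.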


Future

Fujitsu announced at the International Supercomputing Conference in June 2016 that its future exascale supercomputer will feature processors of its own design that implement the ARMv8 architecture. The processor, named the A64FX, will implement extensions to the ARMv8 architecture, equivalent to HPC-ACE2, that Fujitsu is developing with ARM Holdings.


SPARC64 XII

The SPARC64 XII cores run at 3.9 GHz, and the chip is fabricated in TSMC's 20 nm process. It contains 5.5 billion transistors and has 153 GB/s of memory bandwidth. Fujitsu is also described as the only UNIX vendor able to run Solaris 10 on bare metal. The CPU package features up to 12 cores with 8-way SMT each (12 × 8 = 96 threads).


References

* Fujitsu Limited (August 2004). ''SPARC64 V Processor For UNIX Server''.
* Krewell, Kevin (24 November 2003). "Fujitsu Makes SPARC See Double". ''Microprocessor Report''.
* Krewell, Kevin (24 June 2004). "SPARC's New Roadmap". ''Microprocessor Report''.
* Krewell, Kevin (25 October 2004). "SPARC Turns 90nm". ''Microprocessor Report''.
* Krewell, Kevin (14 November 2005). "SPARC's Still Going Strong". ''Microprocessor Report''.
* McGhan, Harlan (25 September 2006). "The Sun-Fujitsu APL Alliance". ''Microprocessor Report''.
* McGhan, Harlan (23 October 2006). "SPARC64 VI Ready for PrimeTime". ''Microprocessor Report''.
* Morgan, Timothy Prickett (4 September 2012). "Fujitsu to embiggen iron bigtime with Sparc64-X". ''The Register''.
* Morgan, Timothy Prickett (1 October 2012). "Fujitsu, Oracle pair up on future 'Athena' Sparc64 chips". ''The Register''.
* Morgan, Timothy Prickett (25 January 2013). "Fujitsu launches 'Athena' Sparc64-X servers in Japan". ''The Register''.
* Sakamoto, Mariko et al. (2003). "Microarchitecture and Performance Analysis of a SPARC-V9 Microprocessor for Enterprise Server Systems". ''Proceedings of the 9th International Symposium on High-Performance Computer Architecture''. pp. 141–152.




External links


* Fujitsu SPARC Servers Roadmap
* Fujitsu PRIMEHPC FX100/FX10 Supercomputers
* Fujitsu SPARC Servers
* Fujitsu SPARC64 V, VI, VII, VIIIfx, IXfx Extensions and X / X+ Specification



