The A64FX is a 64-bit ARM architecture microprocessor designed by Fujitsu. The processor is replacing the SPARC64 V as Fujitsu's processor for supercomputer applications. It powers the Fugaku supercomputer, the fastest supercomputer in the world by TOP500 rankings as of June 2020, November 2020, June 2021 and November 2021.


Design

Fujitsu collaborated with ARM to develop the processor; it is the first processor to use the ARMv8.2-A Scalable Vector Extension (SVE) SIMD instruction set, with a 512-bit vector implementation. It has a "Four-operand FMA with Prefix Instruction", i.e. a MOVPRFX instruction followed by a 3-operand FMA operation, which get packed into a single operation in the pipeline (ARM, like RISC in general, is a 3-operand machine and doesn't have space for 4 operands). For the processor the designers claim ">90% execution efficiency in (D, S, H) GEMM and INT16/8 dot product". The processor uses 32 gigabytes of HBM2 memory with a bandwidth of 1 TB per second. The processor contains 16 PCI Express generation 3 lanes to connect to accelerators (hypothetically, e.g. GPUs and FPGAs). The processor also integrates a TofuD fabric controller with 10 ports, implemented as 20 high-speed 28 Gbit/s lanes, to connect multiple nodes in a cluster. The reported transistor count is about 8.8 billion. Each A64FX processor has 4 NUMA nodes, with each NUMA node having 12 compute cores, for a total of 48 cores per processor. Each NUMA node also has its own level 2 cache, HBM2 memory, and assistant cores for non-computational purposes. Fujitsu intends to produce lower specification machines with reduced assistant cores.
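
The per-processor figures quoted above can be cross-checked with a little arithmetic. The following is a minimal sketch rather than anything taken from Fujitsu documentation: the class and method names are hypothetical, the even split of HBM2 capacity across NUMA nodes is an assumption, and the aggregate link figure is a raw per-direction lane rate that ignores encoding overhead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class A64FXChip:
    """Figures quoted in the text for one A64FX processor (hypothetical model)."""

    numa_nodes: int = 4
    cores_per_numa_node: int = 12
    hbm2_gigabytes: int = 32
    tofud_ports: int = 10
    tofud_lanes: int = 20
    lane_gbit_per_s: int = 28

    def total_compute_cores(self) -> int:
        # 4 NUMA nodes x 12 compute cores each.
        return self.numa_nodes * self.cores_per_numa_node

    def hbm2_gigabytes_per_numa_node(self) -> float:
        # Assumption: the 32 GB of HBM2 is split evenly across NUMA nodes.
        return self.hbm2_gigabytes / self.numa_nodes

    def tofud_lanes_per_port(self) -> float:
        # 20 lanes shared by 10 ports.
        return self.tofud_lanes / self.tofud_ports

    def tofud_raw_gbit_per_s(self) -> int:
        # Raw lane rate summed over all lanes, before any encoding overhead.
        return self.tofud_lanes * self.lane_gbit_per_s


chip = A64FXChip()
print(chip.total_compute_cores())           # 48
print(chip.hbm2_gigabytes_per_numa_node())  # 8.0
print(chip.tofud_lanes_per_port())          # 2.0
print(chip.tofud_raw_gbit_per_s())          # 560
```

The 48-core total matches the figure stated in the text; the other three values are derived under the assumptions named above and are not stated in the source.
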
Reliability, availability and serviceability (RAS) capabilities are claimed, with about 128,400 error checkers in total. In June 2020 the Fugaku supercomputer using this processor became the fastest supercomputer in the world at 415 petaFLOPS, and it reached 442 petaFLOPS on the November 2020 list.
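
The "four-operand FMA" described in this section can be illustrated with a small software model. This is a sketch of the semantics only, not of the A64FX pipeline: registers are modelled as Python lists, the helper names (`lanes`, `movprfx`, `fmla`, `fma4`) are hypothetical, and Python's `a + x * y` rounds twice where a hardware FMA rounds once. The lane counts assume the 512-bit vector width from the text and the usual element sizes of 64, 32 and 16 bits for double (D), single (S) and half (H) precision.

```python
VECTOR_BITS = 512  # vector width quoted in the text


def lanes(element_bits: int) -> int:
    """Number of elements of the given size that fit in one vector."""
    return VECTOR_BITS // element_bits


def movprfx(regs: dict, dst: str, src: str) -> None:
    """Prefix move: copy src into dst ahead of a destructive operation."""
    regs[dst] = list(regs[src])


def fmla(regs: dict, acc: str, n: str, m: str) -> None:
    """Destructive 3-operand multiply-add: acc <- acc + n * m, per lane."""
    regs[acc] = [a + x * y for a, x, y in zip(regs[acc], regs[n], regs[m])]


def fma4(regs: dict, dst: str, addend: str, n: str, m: str) -> None:
    """Four-operand form: dst <- addend + n * m, leaving addend intact."""
    movprfx(regs, dst, addend)
    fmla(regs, dst, n, m)


n_lanes = lanes(64)  # 8 double-precision lanes
regs = {
    "z0": [0.0] * n_lanes,
    "z1": [float(i) for i in range(1, n_lanes + 1)],
    "z2": [2.0] * n_lanes,
    "z3": [10.0] * n_lanes,
}
fma4(regs, "z0", "z3", "z1", "z2")
print(regs["z0"])  # [12.0, 14.0, 16.0, 18.0, 20.0, 22.0, 24.0, 26.0]
print(lanes(64), lanes(32), lanes(16), lanes(8))  # 8 16 32 64
```

The point of the pairing is that the addend register (`z3` here) survives, which a lone destructive 3-operand FMA cannot offer; according to the text, the processor packs the two instructions into a single operation in the pipeline.
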


Implementations

Fujitsu designed the A64FX for the Fugaku. On the June and November 2020 TOP500 lists, the Fugaku ranked as the fastest supercomputer in the world. Fujitsu intends to sell smaller machines with A64FX processors. Anandtech reported in June 2020 that the cost of a PRIMEHPC FX700 server, with 2 A64FX nodes, was (c. ).
Cray is developing supercomputers using the A64FX. One such supercomputer is being built for a consortium in the United Kingdom, led by the University of Bristol and also including the Met Office. It is an upgrade to the Isambard supercomputer, which was built with the Marvell ThunderX2, another ARM architecture microprocessor.
Ookami is an open testbed system, supported by the NSF and run by Stony Brook University and the University at Buffalo, that provides researchers with access to A64FX processors.


See also

* Comparison of ARMv8-A cores
* SPARC64 V
* ThunderX2, another ARM architecture high performance computing microprocessor
* Huawei Kunpeng 920, also an ARM high-performance microprocessor, but developed by the Huawei-owned HiSilicon and only available in China

