Fugaku (supercomputer)

Fugaku is a petascale supercomputer at the Riken Center for Computational Science in Kobe, Japan. Development began in 2014 as the successor to the K computer, and the system debuted in 2020. It is named after an alternative name for Mount Fuji. In the June 2020 TOP500 list it became the fastest supercomputer in the world and the first ARM architecture-based computer to hold that position. At the same time it achieved 1.42 exaFLOPS on the mixed-precision (fp16/fp64) HPL-AI benchmark. It started regular operations in 2021. Frontier superseded Fugaku as the fastest supercomputer in the world in May 2022.

Hardware

The supercomputer is built with the Fujitsu A64FX microprocessor. This CPU is based on the ARM version 8.2A processor architecture and adopts the Scalable Vector Extension for supercomputers. Fugaku was intended to be about 100 times more powerful than the K computer (i.e. a performance target of 1 exaFLOPS). The initial (June 2020) configura ...

Graph500

The Graph500 is a ranking of supercomputer systems focused on data-intensive loads. The project was announced at the International Supercomputing Conference in June 2010, and the first list was published at the ACM/IEEE Supercomputing Conference in November 2010. New versions of the list are published twice a year. The main performance metric used to rank the supercomputers is GTEPS (giga-traversed edges per second). Richard Murphy of Sandia National Laboratories says that "The Graph500's goal is to promote awareness of complex data problems", instead of focusing on computer benchmarks like HPL (High Performance Linpack), on which TOP500 is based. Despite its name, the rating did not contain 500 systems; it grew to 174 entries in June 2014. The algorithm and implementation that won the championship are published in the paper titled "Extreme scale breadth-first search on supercomputers". There is also a Green Graph 500 list, which uses the same performance metric but sorts the list ...
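The excerpt names traversed edges per second as the ranking metric and breadth-first search as the winning kernel. The following is a minimal sketch of how the two relate, not the official Graph500 reference code. The function names `bfs_parents`, `teps` and `gteps` are hypothetical, and the edge count here is simply the number of adjacency entries scanned, which is an assumption that may differ from the benchmark's own counting rules.

```python
from collections import deque


def bfs_parents(adjacency, root):
    """Breadth-first search from `root`.

    `adjacency` maps each vertex to a list of its neighbours.
    Returns (parent map, number of adjacency entries scanned).
    """
    parent = {root: root}          # the root is its own parent by convention
    edges_scanned = 0
    queue = deque([root])
    while queue:
        vertex = queue.popleft()
        for neighbour in adjacency[vertex]:
            edges_scanned += 1
            if neighbour not in parent:
                parent[neighbour] = vertex
                queue.append(neighbour)
    return parent, edges_scanned


def teps(edges_traversed, seconds):
    """Traversed edges per second for one search."""
    return edges_traversed / seconds


def gteps(edges_traversed, seconds):
    """The same rate expressed in units of 10**9 edges per second."""
    return teps(edges_traversed, seconds) / 1e9


if __name__ == "__main__":
    # Tiny undirected graph, stored with both directions of every edge.
    graph = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1]}
    parents, scanned = bfs_parents(graph, 0)
    print(parents, scanned)
```

In a real measurement the elapsed time would come from timing the search itself on a very large generated graph; the toy graph above only shows the bookkeeping.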

Fujitsu A64FX

The A64FX is a 64-bit ARM architecture microprocessor designed by Fujitsu. It replaces the SPARC64 V as Fujitsu's processor for supercomputer applications. It powers the Fugaku supercomputer, the fastest supercomputer in the world by TOP500 rankings as of June 2020, November 2020, June 2021 and November 2021.

Design

Fujitsu collaborated with ARM to develop the processor; it is the first processor to use the ARMv8.2-A Scalable Vector Extension SIMD instruction set, with a 512-bit vector implementation. It has "Four-operand FMA with Prefix Instruction", i.e. a MOVPRFX instruction followed by a 3-operand FMA operation (ARM, like RISC architectures in general, is a 3-operand machine and has no space for 4 operands), which are packed into a single operation in the pipeline. The designers claim ">90% execution efficiency in (D, S, H) GEMM and INT16/8 dot product". The processor uses 32 gigabytes of HBM2 memory with a bandwidth of 1 TB per second. The ...

Mount Fuji

Mount Fuji, or Fugaku, located on the island of Honshū, is the highest mountain in Japan. It is the second-highest volcano located on an island in Asia (after Mount Kerinci on the island of Sumatra) and the seventh-highest peak of an island on Earth. Mount Fuji is an active stratovolcano that last erupted from 1707 to 1708. The mountain lies southwest of Tokyo and is visible from there on clear days. Mount Fuji's exceptionally symmetrical cone, which is covered in snow for about five months of the year, is commonly used as a cultural icon of Japan; it is frequently depicted in art and photography and is visited by sightseers and climbers. Mount Fuji is one of Japan's "Three Holy Mountains", along with Mount Tate and Mount Haku. It is a Special Place of Scenic Beauty and one of Japan's Historic Sites.

Double-precision Floating-point Format

Double-precision floating-point format (sometimes called FP64 or float64) is a floating-point number format, usually occupying 64 bits in computer memory; it represents a wide dynamic range of numeric values by using a floating radix point. Floating point is used to represent fractional values, or when a wider range is needed than is provided by fixed point (of the same bit width), even if at the cost of precision. Double precision may be chosen when the range or precision of single precision would be insufficient. In the IEEE 754-2008 standard, the 64-bit base-2 format is officially referred to as binary64; it was called double in IEEE 754-1985. IEEE 754 specifies additional floating-point formats, including 32-bit base-2 single precision and, more recently, base-10 representations. One of the first programming languages to provide single- and double-precision floating-point data types was Fortran. Before the widespread adoption of IEEE 754-1985, the representation a ...
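As an illustration of the 64-bit binary64 format mentioned above, the sketch below splits a Python float into its three bit fields. The field widths (1 sign bit, 11 exponent bits, 52 fraction bits) come from IEEE 754 rather than from the excerpt, and the helper name `decode_binary64` is hypothetical.

```python
import struct


def decode_binary64(value):
    """Return (sign, biased exponent, fraction) of a float's binary64 encoding."""
    # Pack as a big-endian double, then reinterpret the 8 bytes as an unsigned integer.
    bits = struct.unpack(">Q", struct.pack(">d", value))[0]
    sign = bits >> 63                      # 1 bit
    exponent = (bits >> 52) & 0x7FF        # 11 bits, stored with a bias of 1023
    fraction = bits & ((1 << 52) - 1)      # 52 bits
    return sign, exponent, fraction


if __name__ == "__main__":
    print(decode_binary64(1.0))    # (0, 1023, 0)
    print(decode_binary64(-2.0))   # (1, 1024, 0)
```

Python's built-in `float` is a binary64 value on essentially all current platforms, which is why the round trip through `struct` is enough to inspect the encoding.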

LU Factorization

In numerical analysis and linear algebra, lower–upper (LU) decomposition or factorization factors a matrix as the product of a lower triangular matrix and an upper triangular matrix (see matrix decomposition). The product sometimes includes a permutation matrix as well. LU decomposition can be viewed as the matrix form of Gaussian elimination. Computers usually solve square systems of linear equations using LU decomposition, and it is also a key step when inverting a matrix or computing the determinant of a matrix. The LU decomposition was introduced by the Polish mathematician Tadeusz Banachiewicz in 1938.

Definitions

Let A be a square matrix. An LU factorization refers to the factorization of A, with proper row and/or column orderings or permutations, into two factors: a lower triangular matrix L and an upper triangular matrix U, such that A = LU. In the lower triangular matrix all elements above the diagonal are zero, in the upper triangular matrix, all ...
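The factorization described above, including the optional permutation, can be sketched in a few lines of plain Python. This is one textbook variant (Gaussian elimination with partial pivoting and a unit diagonal on L), chosen here for illustration; the function name `lu_decompose` and the row-permutation list it returns are assumptions of this sketch, not notation from the excerpt.

```python
def lu_decompose(a):
    """Factor a square matrix so that row perm[i] of `a` equals row i of L times U.

    `a` is a list of rows. Returns (perm, l, u), where l is unit lower
    triangular and u is upper triangular.
    """
    n = len(a)
    u = [list(map(float, row)) for row in a]   # working copy, becomes U
    l = [[0.0] * n for _ in range(n)]
    perm = list(range(n))
    for k in range(n):
        # Partial pivoting: pick the largest entry in column k at or below the diagonal.
        pivot = max(range(k, n), key=lambda i: abs(u[i][k]))
        if u[pivot][k] == 0.0:
            raise ValueError("matrix is singular")
        if pivot != k:
            u[k], u[pivot] = u[pivot], u[k]
            l[k], l[pivot] = l[pivot], l[k]
            perm[k], perm[pivot] = perm[pivot], perm[k]
        l[k][k] = 1.0
        # Eliminate column k from the rows below, recording the multipliers in L.
        for i in range(k + 1, n):
            factor = u[i][k] / u[k][k]
            l[i][k] = factor
            for j in range(k, n):
                u[i][j] -= factor * u[k][j]
    return perm, l, u


if __name__ == "__main__":
    perm, l, u = lu_decompose([[4.0, 3.0], [6.0, 3.0]])
    print(perm, l, u)
```

Once L and U are known, a linear system is solved by one forward substitution with L and one back substitution with U, which is why the factorization is the expensive step.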

LINPACK Benchmarks

The LINPACK Benchmarks are a measure of a system's floating-point computing power. Introduced by Jack Dongarra, they measure how fast a computer solves a dense n by n system of linear equations Ax = b, which is a common task in engineering. The latest version of these benchmarks is used to build the TOP500 list, ranking the world's most powerful supercomputers. The aim is to approximate how fast a computer will perform when solving real problems. It is a simplification, since no single computational task can reflect the overall performance of a computer system. Nevertheless, the LINPACK benchmark performance can provide a good correction over the peak performance provided by the manufacturer. The peak performance is the maximal theoretical performance a computer can achieve, calculated as the machine's frequency, in cycles per second, times the number of operations per cycle it can perform. The actual performance will always be lower than the peak pe ...
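The peak-performance formula stated above (frequency times operations per cycle) is simple enough to write down directly. The sketch below does so and also computes the ratio of measured to peak performance. The function names and the clock rate, operations-per-cycle and measured figures are hypothetical values chosen only to make the arithmetic visible; they do not describe any real machine.

```python
def peak_flops(frequency_hz, ops_per_cycle):
    """Theoretical peak: cycles per second times floating-point operations per cycle."""
    return frequency_hz * ops_per_cycle


def efficiency(measured_flops, peak):
    """Fraction of the theoretical peak that a benchmark run actually achieved."""
    return measured_flops / peak


if __name__ == "__main__":
    # Hypothetical single core: 2.0 GHz, 16 operations per cycle.
    peak = peak_flops(2.0e9, 16)
    # Hypothetical measured benchmark result of 24 GFLOPS.
    print(peak, efficiency(2.4e10, peak))
```

For a whole system the same product would additionally be multiplied by the number of cores and nodes, an extension the excerpt does not spell out.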

POSIX

The Portable Operating System Interface (POSIX) is a family of standards specified by the IEEE Computer Society for maintaining compatibility between operating systems. POSIX defines both the system- and user-level application programming interfaces (APIs), along with command line shells and utility interfaces, for software compatibility (portability) with variants of Unix and other operating systems. POSIX is also a trademark of the IEEE. POSIX is intended to be used by both application and system developers.

Name

Originally, the name "POSIX" referred to IEEE Std 1003.1-1988, released in 1988. The family of POSIX standards is formally designated as IEEE 1003 and the ISO/IEC standard number is ISO/IEC 9945. The standards emerged from a project that began in 1984 building on work from related activity in the /usr/group association. Richard Stallman suggested the name POSIX (pronounced as pahz-icks, as in positive, not as poh-six) to the IEEE instead ...

Kernel (operating system)

The kernel is a computer program at the core of a computer's operating system and generally has complete control over everything in the system. It is the portion of the operating system code that is always resident in memory and facilitates interactions between hardware and software components. A full kernel controls all hardware resources (e.g. I/O, memory, cryptography) via device drivers, arbitrates conflicts between processes concerning such resources, and optimizes the utilization of common resources such as CPU and cache usage, file systems, and network sockets. On most systems, the kernel is one of the first programs loaded on startup (after the bootloader). It handles the rest of startup as well as memory, peripherals, and input/output (I/O) requests from software, translating them into data-processing instructions for the central processing unit. The critical code of the kernel is usually loaded into a separate area of memory, which is protected from access by applicatio ...

Torus Fusion

Torus fusion (Tofu) is a proprietary computer network topology for supercomputers developed by Fujitsu. It is a variant of the torus interconnect. The system has been used in the K computer and the Fugaku supercomputer (and their derivatives). Tofu has a six-dimensional mesh/torus topology, scales to over 100,000 nodes, and uses full-duplex links with a peak bandwidth of 10 GB/s (5 GB/s per direction). Each node is connected to its own InterConnect Controller (ICC) chip, which contains four Tofu interfaces (one for the node and three for connecting to other ICC chips) and a router.

Software support

Tofu's six-dimensional mesh/torus topology is abstracted by software to appear as a three-dimensional torus; it is supported by a Tofu-optimized version of the open-source Open MPI Message Passing Interface library. Users can create application programs adapted to a one-, two-, or three-dimensional torus network.
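To make the idea of a torus network concrete, the sketch below lists the direct neighbours of a node in a generic torus of any dimension, where every axis wraps around. It is a simplified illustration and not Tofu's actual addressing scheme: in a mixed mesh/torus some axes would not wrap, and the function name `torus_neighbours` and the 4 x 4 x 4 shape are hypothetical.

```python
def torus_neighbours(coord, shape):
    """Return the neighbours of `coord` in a torus with the given `shape`.

    Each axis contributes two neighbours (one step down, one step up),
    with coordinates wrapping around at the edges.
    """
    neighbours = []
    for axis, size in enumerate(shape):
        for step in (-1, 1):
            moved = list(coord)
            moved[axis] = (coord[axis] + step) % size   # wrap-around link
            neighbours.append(tuple(moved))
    return neighbours


if __name__ == "__main__":
    # A corner node of a hypothetical 4 x 4 x 4 three-dimensional torus.
    print(torus_neighbours((0, 0, 0), (4, 4, 4)))
```

Because of the wrap-around, the corner node has the same number of neighbours as any interior node, which is the property that distinguishes a torus from a plain mesh.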

Exascale Computing

Exascale computing refers to computing systems capable of calculating at least "10^18 IEEE 754 Double Precision (64-bit) operations (multiplications and/or additions) per second (exaFLOPS)"; it is a measure of supercomputer performance. Exascale computing is a significant achievement in computer engineering: primarily, it allows improved scientific applications and better prediction accuracy in domains such as weather forecasting, climate modeling and personalised medicine. Exascale also reaches the estimated processing power of the human brain at the neural level, a target of the Human Brain Project. There has been a race to be the first country to build an exascale computer, typically ranked in the TOP500 list. In 2022, the world's first public exascale computer, Frontier, was announced, and it became the world's fastest supercomputer.

Definitions

Floating point operations per second (FLOPS) are one measure of computer performance. FLOPS can be recorded in different meas ...

Scalable Vector Extension

AArch64 or ARM64 is the 64-bit extension of the ARM architecture family. It was first introduced with the Armv8-A architecture. Arm releases a new extension every year.

ARMv8.x and ARMv9.x extensions and features

Announced in October 2011, ARMv8-A represents a fundamental change to the ARM architecture. It adds an optional 64-bit architecture, named "AArch64", and the associated new "A64" instruction set. AArch64 provides user-space compatibility with the existing 32-bit architecture ("AArch32" / ARMv7-A) and instruction set ("A32"). The 16/32-bit Thumb instruction set is referred to as "T32" and has no 64-bit counterpart. ARMv8-A allows 32-bit applications to be executed in a 64-bit OS, and a 32-bit OS to be under the control of a 64-bit hypervisor. ARM announced its Cortex-A53 and Cortex-A57 cores on 30 October 2012. Apple was the first to release an ARMv8-A compatible core (Cyclone) in a consumer product (iPhone 5S). AppliedMicro, using an FPGA, was the first to demo ARMv8 ...