SW26010
   HOME

TheInfoList



OR:

The SW26010 is a 260-core
manycore processor Manycore processors are special kinds of multi-core processors designed for a high degree of parallel processing, containing numerous simpler, independent processor cores (from a few tens of cores to thousands or more). Manycore processors are use ...
designed by the Shanghai Integrated Circuit Technology and Industry Promotion Center (ICC for short)(
Chinese Chinese can refer to: * Something related to China * Chinese people, people of Chinese nationality, citizenship, and/or ethnicity **''Zhonghua minzu'', the supra-ethnic concept of the Chinese nation ** List of ethnic groups in China, people of va ...
: 上海集成电路技术与产业促进中心 (简称ICC)). It implements the
Sunway architecture Sunway, or Shenwei, (Chinese: ), is a series of computer microprocessors, developed by Jiangnan Computing Lab () in Wuxi, China. It uses a reduced instruction set computer (RISC) architecture, but details are still sparse. History The Sunway se ...
, a 64-bit
reduced instruction set computing In computer engineering, a reduced instruction set computer (RISC) is a computer designed to simplify the individual instructions given to the computer to accomplish tasks. Compared to the instructions given to a complex instruction set comput ...
(RISC) architecture designed in
China China, officially the People's Republic of China (PRC), is a country in East Asia. It is the world's most populous country, with a population exceeding 1.4 billion, slightly ahead of India. China spans the equivalent of five time zones and ...
. The SW26010 has four clusters of 64 ''Compute-Processing Elements'' (CPEs) which are arranged in an eight-by-eight array. The CPEs support
SIMD Single instruction, multiple data (SIMD) is a type of parallel processing in Flynn's taxonomy. SIMD can be internal (part of the hardware design) and it can be directly accessible through an instruction set architecture (ISA), but it should ...
instructions and are capable of performing eight
double-precision Double-precision floating-point format (sometimes called FP64 or float64) is a floating-point number format, usually occupying 64 bits in computer memory; it represents a wide dynamic range of numeric values by using a floating radix point. Flo ...
floating-point In computing, floating-point arithmetic (FP) is arithmetic that represents real numbers approximately, using an integer with a fixed precision, called the significand, scaled by an integer exponent of a fixed base. For example, 12.345 can b ...
operations per cycle. Each cluster is accompanied by a more conventional general-purpose
core Core or cores may refer to: Science and technology * Core (anatomy), everything except the appendages * Core (manufacturing), used in casting and molding * Core (optical fiber), the signal-carrying portion of an optical fiber * Core, the central ...
called the ''Management Processing Element'' (MPE) that provides supervisory functions. Each cluster has its own dedicated
DDR3 SDRAM Double Data Rate 3 Synchronous Dynamic Random-Access Memory (DDR3 SDRAM) is a type of synchronous dynamic random-access memory (SDRAM) with a high bandwidth (" double data rate") interface, and has been in use since 2007. It is the higher-speed ...
controller Controller may refer to: Occupations * Controller or financial controller, or in government accounting comptroller, a senior accounting position * Controller, someone who performs agent handling in espionage * Air traffic controller, a person ...
and a
memory bank A memory bank is a logical unit of storage in electronics, which is hardware-dependent. In a computer, the memory bank may be determined by the memory controller along with physical organization of the hardware memory slots. In a typical synchron ...
with its own
address space In computing, an address space defines a range of discrete addresses, each of which may correspond to a network host, peripheral device, disk sector, a memory cell or other logical or physical entity. For software programs to save and retrieve st ...
. The processor runs at a
clock speed In computing, the clock rate or clock speed typically refers to the frequency at which the clock generator of a processor can generate pulses, which are used to synchronize the operations of its components, and is used as an indicator of the pr ...
of 1.45 GHz. The CPE cores feature 64  KB of
scratchpad memory Scratchpad memory (SPM), also known as scratchpad, scratchpad RAM or local store in computer terminology, is a high-speed internal memory used for temporary storage of calculations, data, and other work in progress. In reference to a microprocesso ...
for data and 16 KB for instructions, and communicate via a
network on a chip A network on a chip or network-on-chip (NoC or )This article uses the convention that "NoC" is pronounced . Therefore, it uses the convention "a" for the indefinite article corresponding to NoC ("a NoC"). Other sources may pronounce it as an ...
, instead of having a traditional
cache hierarchy Cache hierarchy, or multi-level caches, refers to a memory architecture that uses a hierarchy of memory stores based on varying access speeds to cache data. Highly requested data is cached in high-speed access memory stores, allowing swifter access ...
. The MPEs have a more traditional setup, with 32 KB L1 instruction and
data cache A CPU cache is a hardware cache used by the central processing unit (CPU) of a computer to reduce the average cost (time or energy) to access data from the main memory. A cache is a smaller, faster memory, located closer to a processor core, which ...
s and a 256 KB
L2 cache A CPU cache is a hardware cache used by the central processing unit (CPU) of a computer to reduce the average cost (time or energy) to access data from the main memory. A cache is a smaller, faster memory, located closer to a processor core, which ...
. Finally, the on-chip network connects to a single system interconnection interface that connects the chip to the outside world. The SW26010 is used in the
Sunway TaihuLight The Sunway TaihuLight ( ''Shénwēi·tàihú zhī guāng'') is a Chinese supercomputer which, , is ranked fourth in the TOP500 list, with a LINPACK benchmark rating of 93 petaflops. The name is translated as ''divine power, the light of Taihu Lak ...
supercomputer A supercomputer is a computer with a high level of performance as compared to a general-purpose computer. The performance of a supercomputer is commonly measured in floating-point operations per second ( FLOPS) instead of million instructions ...
, which between March and June 2018, was the world's fastest supercomputer as ranked by the
TOP500 The TOP500 project ranks and details the 500 most powerful non-distributed computing, distributed computer systems in the world. The project was started in 1993 and publishes an updated list of the supercomputers twice a year. The first of these ...
project. The system uses 40,960 SW26010s to obtain 93.01  PFLOPS on the
LINPACK benchmark The LINPACK Benchmarks are a measure of a system's Floating-point arithmetic, floating-point computing power. Introduced by Jack Dongarra, they measure how fast a computer solves a dense ''n'' by ''n'' system of linear equations ''Ax'' =&nbs ...
.


Successor: SW26010P

SW26010P includes 6 core groups (CGs), each of which includes one management processing element (MPE), and one 8×8 computing processing element (CPE) cluster. Each CG has its memory controller (MC), connecting to 16GB of DDR4 memory with a bandwidth of 51.2 GB/s. The data exchange between every two CPEs in the same CPE cluster is achieved through the Remote Memory Access (RMA) interface (a replacement of the register communication feature in the previous generation). Each CPE has a fast local data memory (LDM) of 256KB. Each SW26010P processor consists of 390 processing elements.


See also

*
Massively parallel processor array A massively parallel processor array, also known as a multi purpose processor array (MPPA) is a type of integrated circuit which has a massively parallel array of hundreds or thousands of CPUs and RAM memories. These processors pass work to one an ...
*
Loongson Loongson () is the name of a family of general-purpose, MIPS architecture-compatible microprocessors, as well as the name of the Chinese fabless company (Loongson Technology) that develops them. The processors are alternately called Godson proc ...
*
Adapteva Zero ASIC Corporation, formerly Adapteva, Inc., is a fabless semiconductor company focusing on low power many core microprocessor design. The company was the second company to announce a design with 1,000 specialized processing cores on a single i ...
*
Cell (microprocessor) Cell is a multi-core microprocessor microarchitecture that combines a general-purpose PowerPC core of modest performance with streamlined coprocessing elements which greatly accelerate multimedia and vector processing applications, as well as ma ...


References

Manycore processors Sunway microprocessors Supercomputing in China {{computing-stub