A barrel processor is a CPU that switches between threads of execution on every cycle. This CPU design technique is also known as "interleaved" or "fine-grained" temporal multithreading. Unlike simultaneous multithreading in modern superscalar architectures, it generally does not allow execution of multiple instructions in one cycle.
As in preemptive multitasking, each thread of execution is assigned its own program counter and other hardware registers (each thread's architectural state). A barrel processor can guarantee that each thread will execute one instruction every ''n'' cycles, unlike a preemptive multitasking machine, which typically runs one thread of execution for tens of millions of cycles while all other threads wait their turn.
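The fixed rotation described above can be sketched as a small simulation. This is only an illustrative model, not a description of any particular machine: the names `HardwareThread` and `run_barrel` are hypothetical, and "executing an instruction" is reduced to incrementing the thread's program counter.

```python
from dataclasses import dataclass, field


@dataclass
class HardwareThread:
    """Per-thread architectural state (hypothetical model): a program
    counter plus a log of the cycles on which this thread issued."""
    pc: int = 0
    issue_cycles: list = field(default_factory=list)


def run_barrel(num_threads: int, total_cycles: int) -> list:
    """Issue exactly one instruction per cycle, rotating through the
    threads in a fixed order."""
    threads = [HardwareThread() for _ in range(num_threads)]
    for cycle in range(total_cycles):
        thread = threads[cycle % num_threads]  # barrel position this cycle
        thread.issue_cycles.append(cycle)
        thread.pc += 1  # stand-in for executing one instruction
    return threads


threads = run_barrel(num_threads=4, total_cycles=12)
for tid, t in enumerate(threads):
    print(tid, t.issue_cycles)
```

In this model every thread issues on cycles spaced exactly ''n'' apart (here ''n'' = 4), regardless of what the other threads are doing.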
A technique called C-slowing can automatically generate a corresponding barrel processor design from a single-tasking processor design. An ''n''-way barrel processor generated this way acts much like ''n'' separate multiprocessing copies of the original single-tasking processor, each one running at roughly 1/''n'' the original speed.
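As a rough illustration of the idea, the sketch below applies C-slowing to the simplest stateful circuit, an accumulator with one state register. The function name `c_slow_accumulator` and the choice of an accumulator are assumptions made for this example; a real C-slow transformation is applied to every register of a circuit and is followed by retiming.

```python
from collections import deque


def c_slow_accumulator(inputs, c):
    """Accumulator whose single state register is replaced by `c`
    registers in series (hypothetical example circuit).

    The input arriving on cycle t belongs to stream t % c, and each
    stream sees only its own running sum.
    """
    registers = deque([0] * c)  # rightmost register feeds the adder
    outputs = []
    for value in inputs:
        total = registers.pop() + value  # oldest register holds this stream's state
        registers.appendleft(total)      # result re-enters the register chain
        outputs.append(total)
    return outputs


# Two interleaved streams: (1, 2, 3) and (10, 20, 30).
interleaved = c_slow_accumulator([1, 10, 2, 20, 3, 30], c=2)
print(interleaved[0::2])  # running sums of stream 0
print(interleaved[1::2])  # running sums of stream 1
```

With `c=1` the function behaves as an ordinary accumulator. With `c=2` the same adder serves two independent streams, each advancing on every second cycle, which mirrors the "''n'' copies at 1/''n'' speed" behavior described above.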
History
One of the earliest examples of a barrel processor was the I/O processing system in the CDC 6000 series supercomputers. These executed one instruction (or a portion of an instruction) from each of 10 different virtual processors (called peripheral processors) before returning to the first processor. (Source: CDC Cyber 170 Computer Systems; Models 720, 730, 750, and 760; Model 176 (Level B); CPU Instruction Set; PPU Instruction Set; see page 2-44 for an illustration of the rotating "barrel".) The entry on the CDC 6000 series notes that "The peripheral processors are collectively implemented as a barrel processor. Each executes routines independently of the others. They are a loose predecessor of bus mastering or direct memory access."
One motivation for barrel processors was to reduce hardware costs. In the case of the CDC 6x00 PPUs, the digital logic of the processor was much faster than the core memory, so rather than having ten separate processors, the design has ten separate core memory units for the PPUs, all sharing a single set of processor logic.
Another example is the Honeywell 800, which had 8 groups of registers, allowing up to 8 concurrent programs. After each instruction, the processor would (in most cases) switch to the next active program in sequence.
Barrel processors have also been used as large-scale central processors. The Tera MTA (1988) was a large-scale barrel processor design with 128 threads per core. The MTA architecture has seen continued development in successive products, such as the Cray Urika-GD, originally introduced in 2012 (as the YarcData uRiKA) and targeted at data-mining applications.
Barrel processors are also found in embedded systems, where they are particularly useful for their deterministic real-time thread performance. An example is the XMOS XCore XS1 (2007), a four-stage barrel processor with eight threads per core. (Newer processors from XMOS also have the same type of architecture.) The XS1 is found in Ethernet, USB, audio, and control devices, and other applications where I/O performance is critical. When the XS1 is programmed in the 'XC' language, software-controlled direct memory access may be implemented.
Barrel processors have also been used in specialized devices such as the eight-thread Ubicom IP3023 network I/O processor (2004).
Some 8-bit microcontrollers by Padauk Technology feature barrel processors with up to 8 threads per core.
Comparison with single-threaded processors
Advantages
A single-tasking processor spends a lot of time idle, doing nothing useful whenever a cache miss or pipeline stall occurs. Advantages to employing barrel processors over single-tasking processors include:
* The ability to do useful work on the other threads while the stalled thread is waiting.
* Designing an ''n''-way barrel processor with an ''n''-deep pipeline is much simpler than designing a single-tasking processor because a barrel processor never has a pipeline stall and doesn't need feed-forward circuits.
* For real-time applications, a barrel processor can guarantee that a "real-time" thread can execute with precise timing, no matter what happens to the other threads, even if some other thread locks up in an infinite loop or is continuously interrupted by hardware interrupts.
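The pipeline point in the list above can be checked with a toy model. The sketch below is a simplification under stated assumptions: the function name `max_in_flight_per_thread` is hypothetical, every instruction is assumed to take exactly one cycle per stage, and a thread is assumed to need no forwarding if it never has more than one instruction in the pipeline at a time.

```python
def max_in_flight_per_thread(num_threads, depth, cycles):
    """Return the largest number of instructions that any single thread
    has in the pipeline at the same time, under round-robin issue."""
    pipeline = [None] * depth  # pipeline[s] = thread id occupying stage s
    worst = 0
    for cycle in range(cycles):
        # Advance every instruction one stage and issue from the next thread.
        pipeline = [cycle % num_threads] + pipeline[:-1]
        occupied = [tid for tid in pipeline if tid is not None]
        for tid in set(occupied):
            worst = max(worst, occupied.count(tid))
    return worst


print(max_in_flight_per_thread(num_threads=4, depth=4, cycles=20))
print(max_in_flight_per_thread(num_threads=2, depth=4, cycles=20))
```

With four threads on a four-stage pipeline, each thread's previous instruction has left the pipeline before its next one enters, so in this model results never need to be forwarded between overlapping instructions of the same thread. With only two threads on the same pipeline, two instructions from one thread overlap, which is the situation that would call for stalls or forwarding logic.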
Disadvantages
There are a few disadvantages to barrel processors.
* The state of each thread must be kept on-chip, typically in registers, to avoid costly off-chip context switches. This requires a large number of registers compared to typical processors.
* Either all threads must share the same cache, which slows overall system performance, or there must be one unit of cache for each execution thread, which can significantly increase the transistor count and thus the cost of such a CPU. However, in hard real-time embedded systems where barrel processors are often found, memory access costs are typically calculated assuming worst-case cache behavior, so this is a minor concern. Some barrel processors such as the XMOS XS1 do not have a cache at all.
See also
* Super-threading
* Computer multitasking
* Simultaneous multithreading (SMT)
* Hyper-threading
* Vector processor
* Cray XMT
External links
* Soft peripherals: Embedded.com article examines Ubicom's IP3023 processor
* An Evaluation of the Design of the Gamma 60 (French and English)