Barrel processor

A barrel processor is a CPU that switches between threads of execution on every cycle. This CPU design technique is also known as "interleaved" or "fine-grained" temporal multithreading. Unlike simultaneous multithreading in modern superscalar architectures, it generally does not allow execution of multiple instructions in one cycle.

As in preemptive multitasking, each thread of execution is assigned its own program counter and other hardware registers (each thread's architectural state). A barrel processor can guarantee that each thread will execute one instruction every ''n'' cycles, unlike a preemptive multitasking machine, which typically runs one thread of execution for tens of millions of cycles while all other threads wait their turn.

A technique called C-slowing can automatically generate a corresponding barrel processor design from a single-tasking processor design. An ''n''-way barrel processor generated this way acts much like ''n'' separate multiprocessing copies of the original single-tasking processor, each one running at roughly 1/''n'' the original speed.
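
The per-cycle thread rotation described above can be illustrated with a minimal Python sketch. It is a toy model, not a description of any real design: the names `HardwareThread` and `run_barrel` are hypothetical, every instruction is assumed to take exactly one issue slot, and the architectural state is reduced to a program counter.

```python
from dataclasses import dataclass, field


@dataclass
class HardwareThread:
    """Per-thread architectural state, reduced here to a program counter."""
    pc: int = 0
    issue_cycles: list = field(default_factory=list)  # cycles at which this thread issued


def run_barrel(n_threads: int, total_cycles: int) -> list:
    """Issue one instruction per cycle, rotating through the threads in fixed order."""
    threads = [HardwareThread() for _ in range(n_threads)]
    for cycle in range(total_cycles):
        thread = threads[cycle % n_threads]  # strict round-robin slot assignment
        thread.issue_cycles.append(cycle)
        thread.pc += 1                       # assumption: one instruction per slot
    return threads


threads = run_barrel(n_threads=4, total_cycles=16)
for index, thread in enumerate(threads):
    print(index, thread.issue_cycles, "pc =", thread.pc)
```

In this model each of the 4 threads issues exactly once every 4 cycles and retires 4 instructions in 16 cycles, which matches the description of an ''n''-way barrel processor behaving like ''n'' copies of the original processor running at roughly 1/''n'' speed.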


History

One of the earliest examples of a barrel processor was the I/O processing system in the CDC 6000 series supercomputers. These executed one instruction (or a portion of an instruction) from each of 10 different virtual processors (called peripheral processors) before returning to the first processor. (See "CDC Cyber 170 Computer Systems; Models 720, 730, 750, and 760; Model 176 (Level B); CPU Instruction Set; PPU Instruction Set", page 2-44, for an illustration of the rotating "barrel".) A description of the CDC 6000 series notes that "The peripheral processors are collectively implemented as a barrel processor. Each executes routines independently of the others. They are a loose predecessor of bus mastering or direct memory access."

One motivation for barrel processors was to reduce hardware costs. In the case of the CDC 6x00 PPUs, the digital logic of the processor was much faster than the core memory, so rather than having ten separate processors, there were ten separate core memory units for the PPUs, all sharing a single set of processor logic.

Another example is the Honeywell 800, which had 8 groups of registers, allowing up to 8 concurrent programs. After each instruction, the processor would (in most cases) switch to the next active program in sequence.

Barrel processors have also been used as large-scale central processors. The Tera MTA (1988) was a large-scale barrel processor design with 128 threads per core. The MTA architecture has seen continued development in successive products, such as the Cray Urika-GD, originally introduced in 2012 (as the YarcData uRiKA) and targeted at data-mining applications.

Barrel processors are also found in embedded systems, where they are particularly useful for their deterministic real-time thread performance. An example is the XMOS XCore XS1 (2007), a four-stage barrel processor with eight threads per core. (Newer processors from XMOS have the same type of architecture.) The XS1 is found in Ethernet, USB, audio, and control devices, and in other applications where I/O performance is critical. When the XS1 is programmed in the 'XC' language, software-controlled direct memory access may be implemented.

Barrel processors have also been used in specialized devices such as the eight-thread Ubicom IP3023 network I/O processor (2004). Some 8-bit microcontrollers by Padauk Technology feature barrel processors with up to 8 threads per core.


Comparison with single-threaded processors


Advantages

A single-tasking processor spends a lot of time idle, doing nothing useful, whenever a cache miss or pipeline stall occurs. Advantages of barrel processors over single-tasking processors include:
* The ability to do useful work on the other threads while the stalled thread is waiting.
* Designing an ''n''-way barrel processor with an ''n''-deep pipeline is much simpler than designing a single-tasking processor, because a barrel processor never has a pipeline stall and doesn't need feed-forward circuits.
* For real-time applications, a barrel processor can guarantee that a "real-time" thread can execute with precise timing, no matter what happens to the other threads, even if some other thread locks up in an infinite loop or is continuously interrupted by hardware interrupts.
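
The first two advantages can be sketched in a small Python model. The function names are hypothetical, and two simplifying assumptions are made: slots are assigned to threads in a fixed rotation, and a stalled thread's slot simply becomes a bubble (some designs instead skip to the next active thread). With ''n'' threads on an ''n''-stage pipeline, every stage holds an instruction from a different thread, so no instruction can depend on an in-flight instruction of its own thread.

```python
def pipeline_occupancy(n: int, cycle: int) -> list:
    """Thread id held by each of the n pipeline stages at a given cycle.

    Stage s holds the instruction issued s cycles earlier, which belongs
    to thread (cycle - s) mod n under strict round-robin issue.
    """
    return [(cycle - stage) % n for stage in range(n)]


def issue_trace(n: int, total_cycles: int, stalled_thread: int,
                stall_start: int, stall_end: int) -> dict:
    """Issue cycles per thread when one thread stalls during [stall_start, stall_end)."""
    trace = {thread: [] for thread in range(n)}
    for cycle in range(total_cycles):
        thread = cycle % n
        if thread == stalled_thread and stall_start <= cycle < stall_end:
            continue  # assumption: the stalled thread's slot is left empty
        trace[thread].append(cycle)
    return trace


# Every stage holds a distinct thread, so same-thread hazards cannot arise.
occupancy = pipeline_occupancy(n=4, cycle=9)
print(occupancy, len(set(occupancy)) == 4)

# Thread 2 waits on memory for cycles 0..11; the other threads keep their timing.
trace = issue_trace(n=4, total_cycles=16, stalled_thread=2,
                    stall_start=0, stall_end=12)
print(trace)
```

In this model the stalled thread loses only its own slots, while the issue cycles of the remaining threads are identical to the stall-free case, which is the timing guarantee the third point relies on.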


Disadvantages

There are a few disadvantages to barrel processors:
* The state of each thread must be kept on-chip, typically in registers, to avoid costly off-chip context switches. This requires a large number of registers compared to typical processors.
* Either all threads must share the same cache, which slows overall system performance, or there must be one unit of cache for each execution thread, which can significantly increase the transistor count and thus the cost of such a CPU. However, in hard real-time embedded systems, where barrel processors are often found, memory access costs are typically calculated assuming worst-case cache behavior, so this is a minor concern. Some barrel processors, such as the XMOS XS1, do not have a cache at all.
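
The register cost in the first point grows linearly with the number of hardware threads. The short Python sketch below makes the arithmetic explicit. The thread counts 8 and 128 come from the examples in this article, but the figure of 16 registers per thread is a purely hypothetical value chosen for illustration and is not taken from any of the processors mentioned.

```python
def total_thread_registers(n_threads: int, registers_per_thread: int) -> int:
    """On-chip registers needed to hold every thread's architectural state."""
    return n_threads * registers_per_thread


REGISTERS_PER_THREAD = 16  # hypothetical value, for illustration only

for n_threads in (1, 8, 128):
    total = total_thread_registers(n_threads, REGISTERS_PER_THREAD)
    print(f"{n_threads:>3} threads -> {total} registers")
```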


See also

* Super-threading
* Computer multitasking
* Simultaneous multithreading (SMT)
* Hyper-threading
* Vector processor
* Cray XMT


External links


* Soft peripherals: Embedded.com article examines Ubicom's IP3023 processor
* An Evaluation of the Design of the Gamma 60 (French and English)