Runahead
   HOME





Runahead
Runahead is a technique that allows a Processor (computing), computer processor to Speculative execution, speculatively pre-process Instruction (computer science), instructions during CPU cache, cache miss cycles. The pre-processed instructions are used to generate instruction and data stream Instruction prefetch, prefetches by executing instructions leading to cache misses (typically called ''long latency loads'') before they would normally occur, effectively hiding memory latency. In runahead, the processor uses the idle execution resources to calculate instruction and data stream addresses using the available information that is independent of a cache miss. Once the processor has resolved the initial cache miss, all runahead results are discarded, and the processor resumes execution as normal. The primary use case of the technique is to mitigate the effects of the memory wall. The technique may also be used for other purposes, such as pre-computing branch outcomes to achieve high ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   [Amazon]


Speculative Execution
Speculative execution is an optimization (computer science), optimization technique where a computer system performs some task that may not be needed. Work is done before it is known whether it is actually needed, so as to prevent a delay that would have to be incurred by doing the work after it is known that it is needed. If it turns out the work was not needed after all, most changes made by the work are reverted and the results are ignored. The objective is to provide more Concurrency (computer science), concurrency if extra Resource (computer science), resources are available. This approach is employed in a variety of areas, including branch predictor, branch prediction in instruction pipeline, pipelined CPU, processors, value prediction for exploiting value locality, prefetching Instruction prefetch, memory and File system, files, and optimistic concurrency control in Relational database management system, database systems. Speculative multithreading is a special case of specu ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   [Amazon]


Hardware Scout
Hardware scout is a technique that uses otherwise idle processor execution resources to perform prefetching during cache misses. When a thread is stalled by a cache miss, the processor pipeline checkpoints the register file, switches to runahead mode, and continues to issue instructions from the thread that is waiting for memory. The thread of execution in run-ahead mode is known as a ''scout thread''. When the data returns from memory, the processor restores the register file contents from the checkpoint, and switches back to normal execution mode. The computation during run-ahead mode is discarded by the processor; nevertheless, scouting provides speedup because memory level parallelism (MLP) is increased. The cache lines brought into the cache hierarchy are often used by the processor again when it switches back to normal mode. Rock processor scout Sun's Rock processor (later cancelled) used a form of hardware scout. However, any computations in run-ahead mode that do not ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   [Amazon]


picture info

Processor (computing)
In computing and computer science, a processor or processing unit is an electrical component (digital circuit) that performs operations on an external data source, usually memory or some other data stream. It typically takes the form of a microprocessor, which can be implemented on a single or a few tightly integrated metal–oxide–semiconductor integrated circuit chips. In the past, processors were constructed using multiple individual vacuum tubes, multiple individual transistors, or multiple integrated circuits. The term is frequently used to refer to the central processing unit (CPU), the main processor in a system. However, it can also refer to other coprocessors, such as a graphics processing unit (GPU). Traditional processors are typically based on silicon; however, researchers have developed experimental processors based on alternative materials such as carbon nanotubes, graphene, diamond, and alloys made of elements from groups three and five of the periodic tabl ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   [Amazon]


Rock Processor
Rock (or ROCK) was a multithreading, multicore, SPARC microprocessor under development at Sun Microsystems. Canceled in 2010, it was a separate project from the SPARC T-Series (CoolThreads/Niagara) family of processors. Rock aimed at higher per-thread performance, higher floating-point performance, and greater SMP scalability than the Niagara family. The Rock processor targeted traditional high-end data-facing workloads, such as back-end database servers, as well as floating-point intensive high-performance computing workloads, whereas the Niagara family targets network-facing workloads such as web servers. Processor core The Rock processor implements the 64-bit SPARC V9 instruction set and the VIS 3.0 SIMD multimedia instruction set extension. Each Rock processor has 16 cores, with each core capable of running two threads simultaneously, yielding 32 threads per chip. Servers built with Rock use FB-DIMMs to increase reliability, speed and density of memory systems. The Rock p ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   [Amazon]




Soft Error
In electronics and computing, a soft error is a type of error where a signal or datum is wrong. Errors may be caused by a defect, usually understood either to be a mistake in design or construction, or a broken component. A soft error is also a signal or datum which is wrong, but is not assumed to imply such a mistake or breakage. After observing a soft error, there is no implication that the system is any less reliable than before. One cause of soft errors is single event upsets from cosmic rays. In a computer's memory system, a soft error changes an instruction in a program or a data value. Soft errors typically can be remedied by cold booting the computer. A soft error will not damage a system's hardware; the only damage is to the data that is being processed. There are two types of soft errors, ''chip-level soft error'' and ''system-level soft error''. Chip-level soft errors occur when particles hit the chip, e.g., when secondary particles from cosmic rays land on the ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   [Amazon]


picture info

Processor Power Dissipation
Processor power dissipation or processing unit power dissipation is the process in which computer processors consume electrical energy, and dissipate this energy in the form of heat due to the resistance in the electronic circuits. Power management Designing CPUs that perform tasks efficiently without overheating is a major consideration of nearly all CPU manufacturers to date. Historically, early CPUs implemented with vacuum tubes consumed power on the order of many kilowatts. Current CPUs in general-purpose personal computers, such as desktops and laptops, consume power in the order of tens to hundreds of watts. Some other CPU implementations use very little power; for example, the CPUs in mobile phones often use just a few watts of electricity, while some microcontrollers used in embedded systems may consume only a few milliwatts or even as little as a few microwatts. There are a number of engineering reasons for this pattern: * For a given CPU core, energy usage will sc ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   [Amazon]


picture info

Energy Efficiency Of Computer Hardware
In computing, performance per watt is a measure of the energy efficiency of a particular computer architecture or computer hardware. Literally, it measures the rate of computation that can be delivered by a computer for every watt of power consumed. This rate is typically measured by performance on the LINPACK benchmark when trying to compare between computing systems: an example using this is the Green500 list of supercomputers. Performance per watt has been suggested to be a more sustainable measure of computing than Moore's law. System designers building parallel computers often pick CPUs based on their performance per watt of power, because the cost of powering the CPU outweighs the cost of the CPU itself. Spaceflight computers have hard limits on the maximum power available and also have hard requirements on minimum real-time performance. A ratio of processing speed to required electrical power is more useful than raw processing speed. D. J. Shirley; and M. K. McLelland"Th ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   [Amazon]


Operand Forwarding
Operand forwarding (or data forwarding) is an optimization in pipelined CPUs to limit performance deficits which occur due to pipeline stalls. A data hazard can lead to a pipeline stall when the current operation has to wait for the results of an earlier operation which has not yet finished. Example ADD A B C #A=B+C SUB D C A #D=C-A If these two assembly pseudocode instructions run in a pipeline, after fetching and decoding the second instruction, the pipeline stalls, waiting until the result of the addition is written and read. In some cases all stalls from such read-after-write data hazards can be completely eliminated by operand forwarding: Larry Snyder"Pipeline Review" Technical realization The CPU control unit must implement logic to detect dependencies where operand forwarding makes sense. A multiplexer can then be used to select the proper register or flip-flop to read the operand from. See also *Feed forward (control) A feed forward (sometimes wri ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   [Amazon]


picture info

Instruction Set Architecture
In computer science, an instruction set architecture (ISA) is an abstract model that generally defines how software controls the CPU in a computer or a family of computers. A device or program that executes instructions described by that ISA, such as a central processing unit (CPU), is called an ''implementation'' of that ISA. In general, an ISA defines the supported instructions, data types, registers, the hardware support for managing main memory, fundamental features (such as the memory consistency, addressing modes, virtual memory), and the input/output model of implementations of the ISA. An ISA specifies the behavior of machine code running on implementations of that ISA in a fashion that does not depend on the characteristics of that implementation, providing binary compatibility between implementations. This enables multiple implementations of an ISA that differ in characteristics such as performance, physical size, and monetary cost (among other things), but t ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   [Amazon]


picture info

Physical Register File
A register file is an array of processor registers in a central processing unit (CPU). The instruction set architecture of a CPU will almost always define a set of registers which are used to stage data between memory and the functional units on the chip. The register file is part of the architecture and visible to the programmer, as opposed to the concept of transparent caches. In simpler CPUs, these ''architectural registers'' correspond one-for-one to the entries in a physical register file (PRF) within the CPU. More complicated CPUs use register renaming, so that the mapping of which physical entry stores a particular architectural register changes dynamically during execution. Modern integrated circuit-based register files are usually implemented by way of fast static RAMs with multiple ports. Such RAMs are distinguished by having dedicated read and write ports, whereas ordinary multiported SRAMs will usually read and write through the same ports. Register banking is the ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   [Amazon]


Register Renaming
In computer architecture, register renaming is a technique that abstracts logical processor register, registers from physical registers. Every logical register has a set of physical registers associated with it. When a machine language instruction refers to a particular logical register, the processor transposes this name to one specific physical register on the fly. The physical registers are opaque and cannot be referenced directly but only via the canonical names. This technique is used to eliminate false Data dependency, data dependencies arising from the reuse of registers by successive Instruction (computer science), instructions that do not have any real data dependencies between them. The elimination of these false data dependencies reveals more instruction-level parallelism in an instruction stream, which can be exploited by various and complementary techniques such as superscalar and out-of-order execution for better Computer performance, performance. Problem approach ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   [Amazon]


picture info

Cache Conflict
In computing, a cache ( ) is a hardware or software component that stores data so that future requests for that data can be served faster; the data stored in a cache might be the result of an earlier computation or a copy of data stored elsewhere. A cache hit occurs when the requested data can be found in a cache, while a cache miss occurs when it cannot. Cache hits are served by reading data from the cache, which is faster than recomputing a result or reading from a slower data store; thus, the more requests that can be served from the cache, the faster the system performs. To be cost-effective, caches must be relatively small. Nevertheless, caches are effective in many areas of computing because typical computer applications access data with a high degree of locality of reference. Such access patterns exhibit temporal locality, where data is requested that has been recently requested, and spatial locality, where data is requested that is stored near data that has already bee ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   [Amazon]