Runahead
   HOME

TheInfoList



OR:

{{No footnotes, date=November 2010 Runahead is a technique that allows a
microprocessor A microprocessor is a computer processor where the data processing logic and control is included on a single integrated circuit, or a small number of integrated circuits. The microprocessor contains the arithmetic, logic, and control circu ...
to pre-process
instructions Instruction or instructions may refer to: Computing * Instruction, one operation of a processor within a computer architecture instruction set * Computer program, a collection of instructions Music * Instruction (band), a 2002 rock band from Ne ...
during
cache Cache, caching, or caché may refer to: Places United States * Cache, Idaho, an unincorporated community * Cache, Illinois, an unincorporated community * Cache, Oklahoma, a city in Comanche County * Cache, Utah, Cache County, Utah * Cache County ...
miss cycles instead of stalling. The pre-processed instructions are used to generate instruction and
data stream In connection-oriented communication, a data stream is the transmission of a sequence of digitally encoded coherent signals to convey information. Typically, the transmitted symbols are grouped into a series of packets. Data streaming has bec ...
prefetches by detecting
cache misses In computing, a cache ( ) is a hardware or software component that stores data so that future requests for that data can be served faster; the data stored in a cache might be the result of an earlier computation or a copy of data stored elsewher ...
before they would otherwise occur by using the idle execution resources to calculate instruction and data stream fetch addresses using the available information that is independent of the cache miss. The principal hardware cost is a means of
checkpointing Checkpointing is a technique that provides fault tolerance for computing systems. It basically consists of saving a snapshot of the application's state, so that applications can restart from that point in case of failure. This is particularly imp ...
the
register Register or registration may refer to: Arts entertainment, and media Music * Register (music), the relative "height" or range of a note, melody, part, instrument, etc. * ''Register'', a 2017 album by Travis Miller * Registration (organ), th ...
file state and preventing pre-processed stores from modifying
memory Memory is the faculty of the mind by which data or information is encoded, stored, and retrieved when needed. It is the retention of information over time for the purpose of influencing future action. If past events could not be remembered, ...
. This checkpointing can be accomplished using very little hardware since all results computed during runahead are discarded after the cache miss has been serviced, at which time normal execution resumes using the checkpointed
register file A register file is an array of processor registers in a central processing unit (CPU). Register banking is the method of using a single name to access multiple different physical registers depending on the operating mode. Modern integrated circuit- ...
state. Branch outcomes computed during runahead mode can be saved into a
shift register A shift register is a type of digital circuit using a cascade of flip-flops where the output of one flip-flop is connected to the input of the next. They share a single clock signal, which causes the data stored in the system to shift from one loc ...
, which can be used as a highly accurate
branch predictor In computer architecture, a branch predictor is a digital circuit that tries to guess which way a branch (e.g., an if–then–else structure) will go before this is known definitively. The purpose of the branch predictor is to improve the flow ...
when normal operation resumes. Runahead was initially investigated in the context of an in-order microprocessor; however, this technique has been extended for use with out-of-order microprocessors.


Entering runahead

When a runahead processor detects a level one instruction or data cache miss it records the instruction address of the faulting access and enters runahead mode. A demand fetch for the missing instruction or data cache line is generated if necessary. The processor checkpoints the register file by one of several mechanisms (discussed later). The state of the
memory hierarchy In computer architecture, the memory hierarchy separates computer storage into a hierarchy based on response time. Since response time, complexity, and capacity are related, the levels may also be distinguished by their performance and controlli ...
is checkpointed by disabling stores. Store instructions are allowed to compute addresses and check for a hit, but they are not allowed to write to memory. Because the value returned from a cache miss cannot be known ahead of time, it is possible for pre-processed instructions to be dependent upon invalid data. These are denoted by adding an "invalid" or INV bit to every register in the register file. If runahead was initiated by a load instruction, the load's destination register is marked INV.


Pre-processing instructions

The processor then continues to execute instructions after the miss; however, all results are strictly temporary and are only used to attempt to generate additional load, store, and instruction cache misses, which are turned into prefetches. The designer can opt to allow runahead to skip over instructions that are not present in the instruction cache with the understanding that the quality of any prefetches generated will be reduced since the effect of the missing instructions is unknown. Registers that are the target of an instruction that has one or more source registers marked INV are marked INV. This allows the processor to know which register values can be (reasonably) trusted during runahead mode. Branch instructions that cannot be resolved due to INV sources are simply assumed to have had their direction predicted correctly. Branch outcomes are saved in a shift register for later use as highly accurate predictions during normal operation. Note that it is not possible to perfectly track INV register values during runahead mode. This is not required since runahead is only used to optimize performance and all results computed during runahead mode are discarded. In fact, it is impossible to perfectly track invalid register values if runahead was initiated by an instruction cache miss, an instruction cache miss occurred during runahead, a load is dependent upon a store with an INV address (assumes that hardware is present to allow store to load forwarding during runahead), or if a branch outcome during runahead is dependent upon an INV register.


Leaving runahead

The register file state is restored from the checkpoint and the processor is redirected to the original faulting fetch address when the fetch that initiated runahead mode has been serviced.


Register file checkpoint options

The most obvious method of checkpointing the register file (RF) is to simply perform a flash copy to a shadow register file, or backup register file (BRF) when the processor enters runahead mode, then perform a flash copy from the BRF to the RF when normal operation resumes. There are simpler options available. One way to eliminate the flash copy operations is to write to both the BRF and RF during normal operation, read from only the RF during normal operation, and read/write only the BRF during runahead mode. An even more aggressive approach is to eliminate the BRF and rely upon the forwarding paths to provide modified values during runahead mode. Checkpointing is accomplished by disabling register file writes. Modified values during runahead mode can only be provided by the forwarding paths.


See also

* Rock processor *
Hardware scout Hardware scout is a technique that uses otherwise idle processor execution resources to perform prefetching during cache misses. When a thread is stalled by a cache miss, the processor pipeline checkpoints the register file, switches to runahead ...


References


Improving data cache performance by pre-executing instructions under a cache miss


* ttp://ieeexplore.ieee.org/xpl/freeabs_all.jsp?isnumber=26557&arnumber=1183532 Runahead execution: an alternative to very large instruction windows for out-of-order processors Instruction processing