Memory bandwidth
   HOME

TheInfoList



OR:

Memory bandwidth is the rate at which data can be read from or stored into a
semiconductor memory Semiconductor memory is a digital electronic semiconductor device used for digital data storage, such as computer memory. It typically refers to devices in which data is stored within metal–oxide–semiconductor (MOS) memory cells on a si ...
by a
processor Processor may refer to: Computing Hardware * Processor (computing) **Central processing unit (CPU), the hardware within a computer that executes a program *** Microprocessor, a central processing unit contained on a single integrated circuit (I ...
. Memory bandwidth is usually expressed in units of bytes/second, though this can vary for systems with natural data sizes that are not a multiple of the commonly used 8-bit bytes. Memory bandwidth that is advertised for a given memory or system is usually the maximum theoretical bandwidth. In practice the observed memory bandwidth will be less than (and is guaranteed not to exceed) the advertised bandwidth. A variety of
computer benchmark In computing, a benchmark is the act of running a computer program, a set of programs, or other operations, in order to assess the relative performance of an object, normally by running a number of standard tests and trials against it. The ...
s exist to measure sustained memory bandwidth using a variety of access patterns. These are intended to provide insight into the memory bandwidth that a system should sustain on various classes of real applications.


Measurement conventions

There are three different conventions for defining the quantity of data transferred in the numerator of "bytes/second": #The bcopy convention: counts the amount of data copied from one location in memory to another location per unit time. For example, copying 1 million bytes from one location in memory to another location in memory in one second would be counted as 1 million bytes per second. The bcopy convention is self-consistent, but is not easily extended to cover cases with more complex access patterns, for example three reads and one write. #The Stream convention: sums the amount of data that the application code explicitly reads plus the amount of data that the application code explicitly writes.STREAM Benchmark FAQ: Counting Bytes and FLOPS: http://www.cs.virginia.edu/stream/ref.html#counting Using the previous 1 million byte copy example, the STREAM bandwidth would be counted as 1 million bytes read plus 1 million bytes written in one second, for a total of 2 million bytes per second. The STREAM convention is most directly tied to the user code, but may not count all the data traffic that the hardware is actually required to perform. #The hardware convention: counts the actual amount of data read or written by the hardware, whether the data motion was explicitly requested by the user code or not. Using the same 1 million byte copy example, the ''hardware'' bandwidth on computer systems with a write allocate cache policy would include an additional 1 million bytes of traffic because the hardware reads the target array from memory into cache before performing the stores. This gives a total of 3 million bytes per second actually transferred by the hardware. The hardware convention is most directly tied to the hardware, but may not represent the minimum amount of data traffic required to implement the user's code. ::For example, some computer systems have the ability to avoid write allocate traffic using special instructions, leading to the possibility of misleading comparisons of bandwidth based on different amounts of data traffic performed.


Bandwidth computation and nomenclature

The nomenclature differs across memory technologies, but for commodity
DDR SDRAM Double Data Rate Synchronous Dynamic Random-Access Memory (DDR SDRAM) is a double data rate (DDR) synchronous dynamic random-access memory (SDRAM) class of memory integrated circuits used in computers. DDR SDRAM, also retroactively called DDR1 ...
,
DDR2 SDRAM Double Data Rate 2 Synchronous Dynamic Random-Access Memory (DDR2 SDRAM) is a double data rate (DDR) synchronous dynamic random-access memory (SDRAM) interface. It superseded the original DDR SDRAM specification, and was itself superseded by DDR ...
, and
DDR3 SDRAM Double Data Rate 3 Synchronous Dynamic Random-Access Memory (DDR3 SDRAM) is a type of synchronous dynamic random-access memory (SDRAM) with a high bandwidth (" double data rate") interface, and has been in use since 2007. It is the higher-spe ...
memory, the total bandwidth is the product of: * Base DRAM clock frequency * Number of data transfers per clock: Two, in the case of "double data rate" (DDR, DDR2, DDR3, DDR4) memory. * Memory bus (interface) width: Each DDR, DDR2, or DDR3 memory interface is 64 bits wide. Those 64 bits are sometimes referred to as a "line." * Number of interfaces: Modern personal computers typically use two memory interfaces (
dual-channel In the fields of digital electronics and computer hardware, multi-channel memory architecture is a technology that increases the data transfer rate between the DRAM memory and the memory controller by adding more channels of communication between ...
mode) for an effective 128-bit bus width. For example, a computer with dual-channel memory and one DDR2-800 module per channel running at 400 MHz would have a theoretical maximum memory bandwidth of: :400,000,000 clocks per second × 2 lines per clock × 64 bits per line × 2 interfaces = :102,400,000,000 (102.4 billion) bits per second (in bytes, 12,800 MB/s or 12.8 GB/s) This theoretical maximum memory bandwidth is referred to as the "burst rate," which may not be sustainable. The naming convention for DDR, DDR2 and DDR3 modules specifies either a maximum speed (e.g., DDR2-800) or a maximum bandwidth (e.g., PC2-6400). The speed rating (800) is not the maximum clock speed, but twice that (because of the doubled data rate). The specified bandwidth (6400) is the maximum megabytes transferred per second using a 64-bit width. In a dual-channel mode configuration, this is effectively a 128-bit width. Thus, the memory configuration in the example can be simplified as: two DDR2-800 modules running in dual-channel mode. Two memory interfaces per module is a common configuration for PC system memory, but single-channel configurations are common in older, low-end, or low-power devices. Some personal computers and most modern graphics cards use more than two memory interfaces (e.g., four for Intel's
LGA 2011 LGA 2011, also called ''Socket R'', is a CPU socket by Intel released on November 14, 2011. It launched along with LGA 1356 to replace its predecessor, LGA 1366 (Socket B) and LGA 1567. While LGA 1356 was designed for dual-processor or ...
platform and the NVIDIA GeForce GTX 980). High-performance graphics cards running many interfaces in parallel can attain very high total memory bus width (e.g., 384 bits in the NVIDIA GeForce GTX TITAN and 512 bits in the AMD Radeon R9 290X using six and eight 64-bit interfaces respectively).


ECC bits

In systems with error-correcting memory (ECC), the additional width of the interfaces (typically 72 rather than 64 bits) is not counted in bandwidth specifications because the extra bits are unavailable to store user data. ECC bits are better thought of as part of the memory hardware rather than as information stored in that hardware.


See also

*
CAS latency Column Address Strobe (CAS) latency, or CL, is the delay in clock cycles between the READ command and the moment data is available. In asynchronous DRAM, the interval is specified in nanoseconds (absolute time). In synchronous DRAM, the interval ...
*
Dynamic random-access memory Dynamic random-access memory (dynamic RAM or DRAM) is a type of random-access semiconductor memory that stores each bit of data in a memory cell, usually consisting of a tiny capacitor and a transistor, both typically based on metal-ox ...
*
List of device bandwidths This is a list of interface bit rates, is a measure of information transfer rates, or digital bandwidth capacity, at which digital interfaces in a computer or network can communicate over various kinds of buses and channels. The distinction can ...
*
Memory latency ''Memory latency'' is the time (the latency) between initiating a request for a byte or word in memory until it is retrieved by a processor. If the data are not in the processor's cache, it takes longer to obtain them, as the processor will hav ...
*
Memory timings Memory timings or RAM timings describe the timing information of a memory module. Due to the inherent qualities of VLSI and microelectronics, memory chips require time to fully execute commands. Executing commands too quickly will result in data ...
*
Random-access memory Random-access memory (RAM; ) is a form of computer memory that can be read and changed in any order, typically used to store working data and machine code. A random-access memory device allows data items to be read or written in almost the ...


References

{{Reflist


General


BSS Random Access Benchmark
Performance Evaluation and Optimization of Random Memory Access on Multicores with High Productivity a
ACM/IEEE HiPC 2010


External links


STREAM Benchmark

Memory Buy the Numbers
Computer memory