Transactional Memory
   HOME

TheInfoList



OR:

In
computer science Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to Applied science, practical discipli ...
and
engineering Engineering is the use of scientific method, scientific principles to design and build machines, structures, and other items, including bridges, tunnels, roads, vehicles, and buildings. The discipline of engineering encompasses a broad rang ...
, transactional memory attempts to simplify
concurrent programming Concurrent means happening at the same time. Concurrency, concurrent, or concurrence may refer to: Law * Concurrence, in jurisprudence, the need to prove both ''actus reus'' and ''mens rea'' * Concurring opinion (also called a "concurrence"), ...
by allowing a group of load and store instructions to execute in an atomic way. It is a
concurrency control In information technology and computer science, especially in the fields of computer programming, operating systems, multiprocessors, and databases, concurrency control ensures that correct results for Concurrent computing, concurrent operations a ...
mechanism analogous to
database transaction A database transaction symbolizes a unit of work, performed within a database management system (or similar system) against a database, that is treated in a coherent and reliable way independent of other transactions. A transaction generally rep ...
s for controlling access to
shared memory In computer science, shared memory is memory that may be simultaneously accessed by multiple programs with an intent to provide communication among them or avoid redundant copies. Shared memory is an efficient means of passing data between progr ...
in
concurrent computing Concurrent computing is a form of computing in which several computations are executed '' concurrently''—during overlapping time periods—instead of ''sequentially—''with one completing before the next starts. This is a property of a syst ...
. Transactional memory systems provide high-level abstraction as an alternative to low-level thread synchronization. This abstraction allows for coordination between concurrent reads and writes of shared data in parallel systems.


Motivation

In concurrent programming, synchronization is required when parallel threads attempt to access a shared resource. Low-level thread synchronization constructs such as locks are pessimistic and prohibit threads that are outside a
critical section In concurrent programming, concurrent accesses to shared resources can lead to unexpected or erroneous behavior, so parts of the program where the shared resource is accessed need to be protected in ways that avoid the concurrent access. One way to ...
from making any changes. The process of applying and releasing locks often functions as additional overhead in workloads with little conflict among threads. Transactional memory provides
optimistic concurrency control Optimistic concurrency control (OCC), also known as optimistic locking, is a concurrency control method applied to transactional systems such as relational database management systems and software transactional memory. OCC assumes that multiple tran ...
by allowing threads to run in parallel with minimal interference. The goal of transactional memory systems is to transparently support regions of code marked as transactions by enforcing atomicity,
consistency In classical deductive logic, a consistent theory is one that does not lead to a logical contradiction. The lack of contradiction can be defined in either semantic or syntactic terms. The semantic definition states that a theory is consistent ...
and isolation. A transaction is a collection of operations that can execute and commit changes as long as a conflict is not present. When a conflict is detected, a transaction will revert to its initial state (prior to any changes) and will rerun until all conflicts are removed. Before a successful commit, the outcome of any operation is purely speculative inside a transaction. In contrast to lock-based synchronization where operations are serialized to prevent data corruption, transactions allow for additional parallelism as long as few operations attempt to modify a shared resource. Since the programmer is not responsible for explicitly identifying locks or the order in which they are acquired, programs that utilize transactional memory cannot produce a
deadlock In concurrent computing, deadlock is any situation in which no member of some group of entities can proceed because each waits for another member, including itself, to take action, such as sending a message or, more commonly, releasing a lo ...
. With these constructs in place, transactional memory provides a high-level programming abstraction by allowing programmers to enclose their methods within transactional blocks. Correct implementations ensure that data cannot be shared between threads without going through a transaction and produce a serializable outcome. For example, code can be written as: def transfer_money(from_account, to_account, amount): """Transfer money from one account to another.""" with transaction(): from_account.balance -= amount to_account.balance += amount In the code, the block defined by "transaction" is guaranteed atomicity, consistency and isolation by the underlying transactional memory implementation and is transparent to the programmer. The variables within the transaction are protected from external conflicts, ensuring that either the correct amount is transferred or no action is taken at all. Note that concurrency related bugs are still possible in programs that use a large number of transactions, especially in software implementations where the library provided by the language is unable to enforce correct use. Bugs introduced through transactions can often be difficult to debug since breakpoints cannot be placed within a transaction. Transactional memory is limited in that it requires a shared-memory abstraction. Although transactional memory programs cannot produce a deadlock, programs may still suffer from a livelock or resource
starvation Starvation is a severe deficiency in caloric energy intake, below the level needed to maintain an organism's life. It is the most extreme form of malnutrition. In humans, prolonged starvation can cause permanent organ damage and eventually, dea ...
. For example, longer transactions may repeatedly revert in response to multiple smaller transactions, wasting both time and energy.


Hardware vs. software

The abstraction of atomicity in transactional memory requires a hardware mechanism to detect conflicts and undo any changes made to shared data. Hardware transactional memory systems may comprise modifications in processors, cache and bus protocol to support transactions. Speculative values in a transaction must be buffered and remain unseen by other threads until commit time. Large buffers are used to store speculative values while avoiding write propagation through the underlying
cache coherence In computer architecture, cache coherence is the uniformity of shared resource data that ends up stored in multiple local caches. When clients in a system maintain caches of a common memory resource, problems may arise with incoherent data, whi ...
protocol. Traditionally, buffers have been implemented using different structures within the memory hierarchy such as store queues or caches. Buffers further away from the processor, such as the L2 cache, can hold more speculative values (up to a few megabytes). The optimal size of a buffer is still under debate due to the limited use of transactions in commercial programs. In a cache implementation, the cache lines are generally augmented with read and write bits. When the hardware controller receives a request, the controller uses these bits to detect a conflict. If a serializability conflict is detected from a parallel transaction, then the speculative values are discarded. When caches are used, the system may introduce the risk of false conflicts due to the use of cache line granularity.
Load-link/store-conditional In computer science, load-linked/store-conditional (LL/SC), sometimes known as load-reserved/store-conditional (LR/SC), are a pair of instructions used in multithreading to achieve synchronization. Load-link returns the current value of a memory l ...
(LL/SC) offered by many
RISC In computer engineering, a reduced instruction set computer (RISC) is a computer designed to simplify the individual instructions given to the computer to accomplish tasks. Compared to the instructions given to a complex instruction set comput ...
processors can be viewed as the most basic transactional memory support; however, LL/SC usually operates on data that is the size of a native machine word, so only single-word transactions are supported. Although hardware transactional memory provides maximal performance compared to software alternatives, limited use has been seen at this time.
Software transactional memory In computer science, software transactional memory (STM) is a concurrency control mechanism analogous to database transactions for controlling access to shared memory in concurrent computing. It is an alternative to lock-based synchronization. STM ...
provides transactional memory semantics in a software
runtime library In computer programming, a runtime library is a set of low-level routines used by a compiler to invoke some of the behaviors of a runtime environment, by inserting calls to the runtime library into compiled executable binary. The runtime enviro ...
or the programming language, and requires minimal hardware support (typically an atomic
compare and swap In computer science, compare-and-swap (CAS) is an atomic instruction used in multithreading to achieve synchronization. It compares the contents of a memory location with a given value and, only if they are the same, modifies the contents of th ...
operation, or equivalent). As the downside, software implementations usually come with a performance penalty, when compared to hardware solutions.
Hardware acceleration Hardware acceleration is the use of computer hardware designed to perform specific functions more efficiently when compared to software running on a general-purpose central processing unit (CPU). Any transformation of data that can be calcula ...
can reduce some of the overheads associated with software transactional memory. Owing to the more limited nature of hardware transactional memory (in current implementations), software using it may require fairly extensive tuning to fully benefit from it. For example, the dynamic memory allocator may have a significant influence on performance and likewise structure padding may affect performance (owing to cache alignment and false sharing issues); in the context of a virtual machine, various background threads may cause unexpected transaction aborts.


History

One of the earliest implementations of transactional memory was the gated store buffer used in
Transmeta Transmeta Corporation was an American fabless semiconductor company based in Santa Clara, California. It developed low power x86 compatible microprocessors based on a VLIW core and a software layer called Code Morphing Software. Code Morphing ...
's
Crusoe Crusoe may refer to: Art, entertainment, and media * ''Crusoe'' (film), a 1989 film by Caleb Deschanel based on the novel ''Robinson Crusoe'' * ''Crusoe'' (TV series), a 2008 television series based on the novel ''Robinson Crusoe'' * Crusoe the ...
and
Efficeon The Efficeon processor is Transmeta's second-generation 256-bit VLIW design released in 2004 which employs a software engine Code Morphing Software (CMS) to convert code written for x86 processors to the native instruction set of the chip. Like i ...
processors. However, this was only used to facilitate speculative optimizations for binary translation, rather than any form of
speculative multithreading Thread Level Speculation (TLS), also known as Speculative Multithreading, or Speculative Parallelization, is a technique to speculatively execute a section of computer code that is anticipated to be executed later in parallel with the normal exec ...
, or exposing it directly to programmers. Azul Systems also implemented hardware transactional memory to accelerate their
Java Java (; id, Jawa, ; jv, ꦗꦮ; su, ) is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea to the north. With a population of 151.6 million people, Java is the world's List ...
appliances, but this was similarly hidden from outsiders.
Sun Microsystems Sun Microsystems, Inc. (Sun for short) was an American technology company that sold computers, computer components, software, and information technology services and created the Java programming language, the Solaris operating system, ZFS, the ...
implemented hardware transactional memory and a limited form of speculative multithreading in its high-end
Rock processor Rock (or ROCK) was a multithreading, multicore, SPARC microprocessor under development at Sun Microsystems. Canceled in 2010, it was a separate project from the SPARC T-Series (CoolThreads/Niagara) family of processors. Rock aimed at higher per ...
. This implementation proved that it could be used for lock elision and more complex hybrid transactional memory systems, where transactions are handled with a combination of hardware and software. The Rock processor was canceled in 2009, just before the acquisition by
Oracle An oracle is a person or agency considered to provide wise and insightful counsel or prophetic predictions, most notably including precognition of the future, inspired by deities. As such, it is a form of divination. Description The word '' ...
; while the actual products were never released, a number of prototype systems were available to researchers. In 2009,
AMD Advanced Micro Devices, Inc. (AMD) is an American multinational semiconductor company based in Santa Clara, California, that develops computer processors and related technologies for business and consumer markets. While it initially manufactur ...
proposed the
Advanced Synchronization Facility Advanced Synchronization Facility (ASF) is a proposed extension to the x86-64 instruction set architecture that adds hardware transactional memory support. It was introduced by AMD; the latest specification was dated March 2009. , it was still i ...
(ASF), a set of
x86 x86 (also known as 80x86 or the 8086 family) is a family of complex instruction set computer (CISC) instruction set architectures initially developed by Intel based on the Intel 8086 microprocessor and its 8088 variant. The 8086 was introd ...
extensions that provide a very limited form of hardware transactional memory support. The goal was to provide hardware primitives that could be used for higher-level synchronization, such as software transactional memory or lock-free algorithms. However, AMD has not announced whether ASF will be used in products, and if so, in what timeframe. More recently, IBM announced in 2011 that
Blue Gene/Q Blue Gene is an IBM project aimed at designing supercomputers that can reach operating speeds in the petaFLOPS (PFLOPS) range, with low power consumption. The project created three generations of supercomputers, Blue Gene/L, Blue Gene/P, ...
had hardware support for both transactional memory and speculative multithreading. The transactional memory could be configured in two modes; the first is an unordered and single-version mode, where a write from one transaction causes a conflict with any transactions reading the same memory address. The second mode is for speculative multithreading, providing an ordered, multi-versioned transactional memory. Speculative threads can have different versions of the same memory address, and hardware implementation keeps track of the age for each thread. The younger threads can access data from older threads (but not the other way around), and writes to the same address are based on the thread order. In some cases, dependencies between threads can cause the younger versions to abort.
Intel Intel Corporation is an American multinational corporation and technology company headquartered in Santa Clara, California. It is the world's largest semiconductor chip manufacturer by revenue, and is one of the developers of the x86 seri ...
's
Transactional Synchronization Extensions Transactional Synchronization Extensions (TSX), also called Transactional Synchronization Extensions New Instructions (TSX-NI), is an extension to the x86 instruction set architecture (ISA) that adds hardware transactional memory support, speeding ...
(TSX) is available in some of the Skylake processors. It was earlier implemented in Haswell and Broadwell processors as well, but the implementations turned out both times to be defective and support for TSX was disabled. The TSX specification describes the transactional memory API for use by software developers, but withholds details on technical implementation.
ARM architecture ARM (stylised in lowercase as arm, formerly an acronym for Advanced RISC Machines and originally Acorn RISC Machine) is a family of reduced instruction set computer (RISC) instruction set architectures for computer processors, configured ...
has a similar extension. As of GCC 4.7, an experimental library for transactional memory is available which utilizes a hybrid implementation. The PyPy variant of Python also introduces transactional memory to the language.


Available implementations

* Hardware: ** Arm Transactional Memory Extension (TME) **
Blue Gene/Q Blue Gene is an IBM project aimed at designing supercomputers that can reach operating speeds in the petaFLOPS (PFLOPS) range, with low power consumption. The project created three generations of supercomputers, Blue Gene/L, Blue Gene/P, ...
processor from IBM (Sequoia supercomputer) ** IBM zEnterprise EC12, the first commercial server to include transactional memory processor instructions ** Intel's
Transactional Synchronization Extensions Transactional Synchronization Extensions (TSX), also called Transactional Synchronization Extensions New Instructions (TSX-NI), is an extension to the x86 instruction set architecture (ISA) that adds hardware transactional memory support, speeding ...
(TSX), available in select Haswell-based processors and newer until be removed in
Comet Lake Comet Lake is Intel's codename for its 10th generation Core microprocessors. They are manufactured using Intel's third 14 nm Skylake process refinement, succeeding the Whiskey Lake U-series mobile processor and Coffee Lake desktop proces ...
** IBM
POWER8 POWER8 is a family of superscalar multi-core microprocessors based on the Power ISA, announced in August 2013 at the Hot Chips conference. The designs are available for licensing under the OpenPOWER Foundation, which is the first time for s ...
and 9, removed in
Power10 Power10 is a superscalar, multithreading, multi-core microprocessor family, based on the open source Power ISA, and announced in August 2020 at the Hot Chips conference; systems with Power10 CPUs. Generally available from September 2021 in th ...
( Power ISA v.3.1) **
Rock processor Rock (or ROCK) was a multithreading, multicore, SPARC microprocessor under development at Sun Microsystems. Canceled in 2010, it was a separate project from the SPARC T-Series (CoolThreads/Niagara) family of processors. Rock aimed at higher per ...
(canceled by
Oracle An oracle is a person or agency considered to provide wise and insightful counsel or prophetic predictions, most notably including precognition of the future, inspired by deities. As such, it is a form of divination. Description The word '' ...
) * Software: ** Vega 2 from
Azul Systems Azul Systems, Inc. develops runtimes ( JDKs, JVMs) for executing Java-based applications. Founded in March 2002, Azul Systems is headquartered in Sunnyvale, California Products Azul Platform Prime (Formerly Zing) Azul produces ''Platform ...
** STM Monad in the
Glasgow Haskell Compiler The Glasgow Haskell Compiler (GHC) is an open-source native code compiler for the functional programming language Haskell. It provides a cross-platform environment for the writing and testing of Haskell code and it supports numerous extensions, ...
** STMX in
Common Lisp Common Lisp (CL) is a dialect of the Lisp programming language, published in ANSI standard document ''ANSI INCITS 226-1994 (S20018)'' (formerly ''X3.226-1994 (R1999)''). The Common Lisp HyperSpec, a hyperlinked HTML version, has been derived fro ...
** Refs in
Clojure Clojure (, like ''closure'') is a dynamic and functional dialect of the Lisp programming language on the Java platform. Like other Lisp dialects, Clojure treats code as data and has a Lisp macro system. The current development process is comm ...
** gcc 4.7+ for C/C++ **
PyPy PyPy () is an implementation of the Python programming language. PyPy often runs faster than the standard implementation CPython because PyPy uses a just-in-time compiler. Most Python code runs well on PyPy except for code that depends on CPytho ...
** Part of the picotm Transaction Framework for C ** The TVar in concurrent-ruby, a concurrency library for Ruby


See also

*
Memory semantics In computing and parallel processing, memory semantics refers to the process logic used to control access to shared memory locations, or at a higher level to shared variables in the presence of multiple threads or processors. Memory semantics may ...
*
Automatic mutual exclusion Automatic mutual exclusion is a parallel computing programming paradigm in which threads are divided into atomic chunks, and the atomic execution of the chunks automatically parallelized using transactional memory. References See also * Bulk ...


References


Further reading

* * * Dave Dice, Yossi Lev, Mark Moir, Dan Nussbaum, and Marek Olszewski. (2009) "Early experience with a commercial hardware transactional memory implementation." Sun Microsystems technical report (60 pp.) SMLI TR-2009-180. A short version appeared at ASPLOS’09 * Amy Wang, Matthew Gaudet, Peng Wu, José Nelson Amaral, Martin Ohmacht, Christopher Barton, Raul Silvera, and Maged Michael.
Evaluation of Blue Gene/Q hardware support for transactional memories
. In Proceedings of the 21st international conference on Parallel architectures and compilation techniques, pp. 127–136. ACM, 2012. * Jacobi, C., Slegel, T., & Greiner, D. (2012, December).
Transactional memory architecture and implementation for IBM System z
. In Microarchitecture (MICRO), 2012 45th Annual IEEE/ACM International Symposium on (pp. 25–36). IEEE. * Harold W. Cain, Maged M. Michael, Brad Frey, Cathy May, Derek Williams, and Hung Le. "Robust Architectural Support for Transactional Memory in the Power Architecture." In ISCA '13 Proceedings of the 40th Annual International Symposium on Computer Architecture, pp. 225–236, ACM, 2013.


External links

* Michael Neuling (IBM),
What's the deal with Hardware Transactional Memory!?!
introductory talk at
linux.conf.au linux.conf.au (often abbreviated as lca) is Australasia's regional Linux and Open Source conference. It is a roaming conference, held in a different Australian or New Zealand city every year, coordinated by Linux Australia and organised by lo ...
2014
Transactional Memory Online
Categorized bibliography about transactional memory {{DEFAULTSORT:Transactional Memory Concurrency control Transaction processing Computer memory