Lightning Memory-Mapped Database

Lightning Memory-Mapped Database (LMDB) is a software library that provides an embedded transactional database in the form of a key-value store. LMDB is written in C with API bindings for several programming languages. LMDB stores arbitrary key/data pairs as byte arrays, has a range-based search capability, supports multiple data items for a single key, and has a special mode for appending records (MDB_APPEND) without checking for consistency (LMDB Reference Guide, retrieved 2014-10-19). LMDB is not a relational database; it is strictly a key-value store like Berkeley DB and dbm.

LMDB may also be used concurrently in a multi-threaded or multi-processing environment, with read performance scaling linearly by design. An LMDB database may have only one writer at a time; however, unlike in many similar key-value databases, write transactions do not block readers, nor do readers block writers. LMDB is also unusual in that multiple applications on the same system may simultaneously open and use the same LMDB store, as a means to scale up performance. In addition, LMDB does not require a transaction log (which improves write performance, since data need not be written twice) because it maintains data integrity inherently by design.
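
The data model described above (byte-string keys kept in sorted order, range search, several data items under one key, and an append mode that relies on the caller supplying keys in order) can be sketched with a small in-memory toy. This is only an illustration under stated assumptions: `ToyOrderedStore` and its methods are hypothetical names and are not part of the LMDB API, and the toy raises an error on out-of-order appends purely to make the ordering requirement visible.

```python
import bisect


class ToyOrderedStore:
    """In-memory stand-in for an ordered key-value store (hypothetical, not LMDB)."""

    def __init__(self):
        self._keys = []    # sorted list of byte-string keys
        self._values = {}  # key -> list of data items stored under that key

    def put(self, key: bytes, value: bytes, append: bool = False) -> None:
        if key in self._values:
            # Multiple data items may live under a single key.
            self._values[key].append(value)
            return
        if append:
            # Append mode skips the search for the insert position, so the
            # caller is responsible for supplying keys in sorted order.
            if self._keys and key < self._keys[-1]:
                raise ValueError("append mode requires keys in sorted order")
            self._keys.append(key)
        else:
            bisect.insort(self._keys, key)
        self._values[key] = [value]

    def get_all(self, key: bytes):
        return list(self._values.get(key, []))

    def range_from(self, start: bytes):
        """Yield (key, value) pairs for every key >= start, in key order."""
        first = bisect.bisect_left(self._keys, start)
        for key in self._keys[first:]:
            for value in self._values[key]:
                yield key, value


store = ToyOrderedStore()
store.put(b"b", b"2")
store.put(b"a", b"1")
store.put(b"b", b"3")               # second item under the same key
store.put(b"c", b"4", append=True)  # b"c" sorts after every existing key

items_for_b = store.get_all(b"b")
from_b = list(store.range_from(b"b"))

try:
    store.put(b"a0", b"x", append=True)  # out of order: rejected by the toy
    append_rejected = False
except ValueError:
    append_rejected = True
```

The sorted key list plays the role that the B+ tree plays in the real library: a range search is a positioned lookup followed by an in-order walk.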


History

LMDB's design was first discussed in a 2009 post to the OpenLDAP developer mailing list, in the context of exploring solutions to the cache management difficulty caused by the project's dependence on Berkeley DB. A specific goal was to replace the multiple layers of configuration and caching inherent to Berkeley DB's design with a single, automatically managed cache under the control of the host operating system. Development subsequently began, initially as a fork of a similar implementation from the OpenBSD ldapd project. The first publicly available version appeared in the OpenLDAP source repository in June 2011. The project was known as MDB until November 2012, after which it was renamed in order to avoid conflicts with existing software.


Technical description

Internally, LMDB uses B+ tree data structures. The efficiency of its design and its small footprint had the unintended side effect of providing good write performance as well. LMDB has an API similar to those of Berkeley DB and dbm. LMDB treats the computer's memory as a single address space, shared across multiple processes or threads using shared memory with copy-on-write semantics (known historically as a single-level store). Because most earlier computing architectures had 32-bit memory address spaces, which impose a hard limit of 4 GB on the size of any database mapped directly into a single-level store, the effectiveness of this technique was strictly limited. However, today's 64-bit processors mostly implement 48-bit address spaces, giving access to 47-bit addresses or 128 terabytes of database size, which makes databases using shared memory useful once again in real-world applications.

Specific noteworthy technical features of LMDB are:
* Its use of B+ trees. With an LMDB instance being in shared memory and the B+ tree block size being set to the OS page size, access to an LMDB store is extremely memory-efficient.
* New data is written without overwriting or moving existing data. This results in guaranteed data integrity and reliability without requiring transaction logs or cleanup services.
* The provision of a unique append-write mode (MDB_APPEND), which is implemented by allowing the new record to be added directly to the end of the B+ tree. This reduces the number of page read and write operations, resulting in greatly increased performance, but makes the programmer responsible for ensuring that keys are already in sorted order when they are stored into the database.
* Copy-on-write semantics help ensure data integrity as well as providing transactional guarantees and simultaneous access by readers without requiring any locking, even by the current writer. New memory pages required internally during data modifications are allocated through copy-on-write semantics by the underlying OS: the LMDB library itself never actually modifies older data being accessed by readers because it simply cannot do so; any shared-memory update automatically creates a completely independent copy of the memory page being written to.
* As LMDB is memory-mapped, it can return direct pointers to the memory addresses of keys and values through its API, thereby avoiding unnecessary and expensive copying of memory. This results in greatly increased performance (especially when the values stored are extremely large) and expands the potential use cases for LMDB.
* LMDB also tracks unused memory pages, using a B+ tree to keep track of pages freed (no longer needed) during transactions. By tracking unused pages, the need for garbage collection (and a garbage collection phase which would consume CPU cycles) is completely avoided. Transactions which need new pages are first given pages from this tree of free pages; only after these are used up will the database expand into formerly unused areas of the underlying memory-mapped file. On a modern filesystem with sparse file support, this helps minimise actual disk usage.

The file format of LMDB is, unlike that of Berkeley DB, architecture-dependent. This means that a conversion must be done before moving a database from a 32-bit machine to a 64-bit machine, or between computers of differing endianness.
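
The interplay between copy-on-write updates and free-page reuse described above can be illustrated with a toy page store. This is a minimal sketch under stated assumptions, not LMDB's actual page layout or allocator: `ToyCowStore`, its "slots", and its methods are hypothetical, and a tuple of page ids stands in for the B+ tree root.

```python
class ToyCowStore:
    """Toy copy-on-write page store (hypothetical; not LMDB's on-disk format)."""

    def __init__(self, num_slots: int):
        self.pages = [() for _ in range(num_slots)]  # each page is an immutable tuple
        self.free = []                               # ids of pages no snapshot needs
        self.root = tuple(range(num_slots))          # current snapshot: slot -> page id

    def snapshot(self):
        """A reader only remembers the current root: no lock, no copy."""
        return self.root

    def read(self, root, slot):
        return self.pages[root[slot]]

    def _alloc(self, content):
        if self.free:
            # Reuse a freed page before growing the store.
            page_id = self.free.pop()
            self.pages[page_id] = content
        else:
            page_id = len(self.pages)
            self.pages.append(content)
        return page_id

    def write(self, slot, item):
        """Copy the page, change the copy, then publish a new root."""
        old_id = self.root[slot]
        new_id = self._alloc(self.pages[old_id] + (item,))
        new_root = list(self.root)
        new_root[slot] = new_id
        self.root = tuple(new_root)
        return old_id  # replaced page; reusable once no reader needs it

    def release(self, page_id):
        """Mark a replaced page as reusable (call only when no reader uses it)."""
        self.free.append(page_id)


store = ToyCowStore(num_slots=2)
old_view = store.snapshot()          # a reader that started before the write
replaced = store.write(0, "k1=v1")   # the writer never touches the old page
new_view = store.snapshot()

old_read = store.read(old_view, 0)   # the old reader still sees its version
new_read = store.read(new_view, 0)

pages_before = len(store.pages)
store.release(replaced)              # the old reader has finished
store.write(1, "k2=v2")              # this write reuses the released page
pages_after = len(store.pages)
```

In this toy, the store grows only when the free list is empty, which mirrors the behaviour described in the last bullet: freed pages are handed out first, and only then does the store expand.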


Concurrency

LMDB employs multiversion concurrency control (MVCC) and allows multiple threads within multiple processes to coordinate simultaneous access to a database. Readers scale linearly by design. While write transactions are globally serialized via a mutex, read-only transactions operate in parallel, including in the presence of a write transaction, and are entirely wait-free except for the first read-only transaction on a thread. Each thread reading from a database gains ownership of an element in a shared memory array, which it may update to indicate when it is within a transaction. Writers scan the array to determine the oldest database version that must be preserved, without requiring direct synchronization with active readers.
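
The reader array described above can be sketched as follows. This is a simplified single-process model with hypothetical names (`ToyReaderTable`, `reusable_pages`); it assumes that each reader records the database version it is reading and that pages freed by the commit that produced version t are still needed by any reader of an earlier version. The real library keeps the array in shared memory, which is not reproduced here.

```python
NO_TXN = None  # marker for a slot whose owner is not inside a transaction


class ToyReaderTable:
    """One slot per reader thread (hypothetical model of the shared array)."""

    def __init__(self, num_slots: int):
        self.slots = [NO_TXN] * num_slots

    def begin_read(self, slot: int, version: int) -> None:
        # Each reader writes only to its own slot, so readers never contend.
        self.slots[slot] = version

    def end_read(self, slot: int) -> None:
        self.slots[slot] = NO_TXN

    def oldest_active(self, current_version: int) -> int:
        """Writer-side scan: the oldest version any active reader still uses."""
        active = [v for v in self.slots if v is not NO_TXN]
        return min(active, default=current_version)


def reusable_pages(freed_by_version, oldest: int):
    """Pages freed by the commit that produced version t are reusable only
    when no active reader is on a version older than t."""
    return sorted(
        page
        for version, pages in freed_by_version.items()
        if version <= oldest
        for page in pages
    )


table = ToyReaderTable(num_slots=4)
freed = {5: [10, 11], 6: [12], 7: [13]}  # version -> pages freed by that commit

table.begin_read(0, 5)                   # a long-running reader on version 5
table.begin_read(1, 7)                   # a reader on the latest version
oldest = table.oldest_active(current_version=7)
reusable_now = reusable_pages(freed, oldest)

table.end_read(0)                        # the old reader finishes
oldest_after = table.oldest_active(current_version=7)
reusable_after = reusable_pages(freed, oldest_after)
```

The sketch also shows a practical consequence of this design: a long-running reader holds back page reuse, because the writer cannot recycle any page that the reader's version might still reference.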


Performance

In 2011, Google published software which allowed users to generate micro-benchmarks comparing LevelDB's performance to SQLite and Kyoto Cabinet in different scenarios. In 2012, Symas added support for LMDB and Berkeley DB and made the updated benchmarking software publicly available. The resulting benchmarks showed that LMDB outperformed all other databases in read and batch write operations. SQLite with LMDB excelled on write operations, particularly on synchronous/transactional writes.

The benchmarks showed that the underlying filesystem has a big influence on performance. JFS with an external journal performs well, especially compared to other modern systems like Btrfs and ZFS. Zimbra has tested back-mdb against back-hdb performance in OpenLDAP, with LMDB clearly outperforming the BDB-based back-hdb. Many other OpenLDAP users have observed similar benefits.

Since the initial benchmarking work done in 2012, multiple follow-on tests have been conducted with additional database engines for both in-memory and on-disk workloads, characterizing the performance across multiple CPUs and record sizes. These tests show that LMDB performance is unmatched on all in-memory workloads, and that it excels in all disk-bound read workloads as well as in disk-bound write workloads using large record sizes. The benchmark driver code was subsequently published on GitHub and further expanded in database coverage.


Reliability

LMDB was designed from the start to resist data loss in the face of system and application crashes. Its copy-on-write approach never overwrites currently-in-use data. Avoiding overwrites means the structure on disk/storage is always valid, so application or system crashes can never leave the database in a corrupted state. In its default mode, at worst a crash can lose data from the last not-yet-committed write transaction. Even with all asynchronous modes enabled, only a catastrophic OS failure or a hardware power-loss event, rather than merely an application crash, could potentially result in any data corruption.

Two academic papers from the USENIX OSDI Symposium covered failure modes of DB engines (including LMDB) under a sudden power loss or system crash. The paper by Pillai et al. did not find any failure in LMDB that would occur in the real-world file systems considered; the single failure identified by the study in LMDB relates only to hypothetical file systems. The paper by Mai Zheng et al. claims to point out failures in LMDB, but the conclusion depends on whether fsync or fdatasync is utilised. Using fsync ameliorates the problem. The choice between fsync and fdatasync is a compile-time switch; fsync is not the default in current Linux builds of LMDB, but it is the default on macOS, *BSD, Android, and Windows. Default Linux builds of LMDB are therefore the only ones vulnerable to the problem discovered by Zheng et al.; however, Linux users may simply rebuild LMDB to utilise fsync instead.

When provided with a corrupt database, such as one produced by fuzzing, LMDB may crash. LMDB's author considers the case unlikely to be concerning, but has nevertheless produced a partial fix in a separate branch.
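
The difference between the two system calls mentioned above can be shown with a short, generic example. This is only a sketch of the calls themselves: `durable_write` is a hypothetical helper, not LMDB code, and in LMDB the choice is made at compile time rather than through a runtime flag as it is here.

```python
import os
import tempfile


def durable_write(path: str, data: bytes, full_sync: bool = True) -> str:
    """Write data and flush it to stable storage; return the call used."""
    with open(path, "wb") as handle:
        handle.write(data)
        handle.flush()  # move Python's buffer into the OS page cache
        if full_sync or not hasattr(os, "fdatasync"):
            # fsync flushes the file data and all of its metadata.
            os.fsync(handle.fileno())
            return "fsync"
        # fdatasync may skip metadata that is not needed to read the data
        # back; it is not available on every platform, hence the check above.
        os.fdatasync(handle.fileno())
        return "fdatasync"


with tempfile.TemporaryDirectory() as tmp_dir:
    target = os.path.join(tmp_dir, "example.bin")
    used_full = durable_write(target, b"committed", full_sync=True)
    used_fast = durable_write(target, b"committed", full_sync=False)
    with open(target, "rb") as handle:
        stored = handle.read()
```

Which call is sufficient depends on the file system and on what metadata a commit relies on, which is the point at issue in the discussion above.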


Open source license

In June 2013, Oracle changed the license of Berkeley DB (a related project) from the Sleepycat license to the Affero General Public License, thus restricting its use in a wide variety of applications. This caused the Debian project to exclude the library from version 6.0 onwards. The new license was also criticized as unfriendly to commercial redistributors. This sparked discussion over whether the same licensing change could happen to LMDB. Author Howard Chu made clear that LMDB is part of the OpenLDAP project, which had its BSD-style license before he joined, and that it will stay that way. No copyright is transferred to anybody by checking in code, which makes a move similar to Oracle's impossible. The Berkeley DB license issue has caused major Linux distributions such as Debian to completely phase out their use of Berkeley DB, with a preference for LMDB.


API and uses

There are wrappers for several programming languages, such as C++, Java, Python, Lua, Go, Ruby, Objective-C, JavaScript, C#, Perl, PHP, Tcl and Common Lisp. A complete list of wrappers may be found on the main web site.

Howard Chu ported SQLite 3.7.7.1 to use LMDB instead of its original B-tree code, calling the end result SQLightning. One cited insert test of 1000 records was 20 times faster than the original SQLite with its B-tree implementation. LMDB is available as a backing store for other open source projects including Cyrus SASL, Heimdal Kerberos, and OpenDKIM. It is also available in some other NoSQL projects like MemcacheDB and Mapkeeper.

LMDB was used to make the in-memory store Redis persist data on disk. The existing back-end in Redis showed pathological behaviour in rare cases, and a replacement was sought. The API of LMDB was criticized as baroque, requiring a lot of code to get simple things done. However, its performance and reliability during testing were considerably better than those of the alternative back-end stores that were tried.

An independent third-party software developer utilised the Python bindings to LMDB in a high-performance environment and published, on the technical news site Slashdot, how the system managed to successfully sustain 200,000 simultaneous read, write and delete operations per second (a total of 600,000 database operations per second).

An up-to-date list of applications using LMDB is maintained on the main web site.


Application support

Many popular free software projects distribute or include support for LMDB, often as the primary or sole storage mechanism.
* The Debian, Ubuntu, Fedora, and openSUSE operating systems.
* OpenLDAP, for which LMDB was originally developed.
* Postfix, via an adapter.
* PowerDNS, a DNS server.
* CFEngine, which uses LMDB by default since version 3.6.0.
* Shopify, which uses LMDB in its SkyDB system.
* Knot DNS, a high-performance DNS server.
* Monero, an open source cryptocurrency created in April 2014 that focuses on privacy, decentralisation and scalability.
* Enduro/X middleware, which uses LMDB for an optional XATMI microservices (SOA) cache: the first request invokes the actual service, and on subsequent requests the client process reads the saved result directly from LMDB.
* Samba Active Directory Domain Controller.
* Nano, a peer-to-peer, open source cryptocurrency created in 2015 that prioritizes fast and fee-less transactions.


Technical reviews of LMDB

LMDB makes novel use of well-known computer science techniques such as copy-on-write semantics and B+ trees to provide atomicity and reliability guarantees as well as performance that can be hard to accept, given the library's relative simplicity and the fact that no other similar key-value store offers the same guarantees or overall performance, even though the authors explicitly state in presentations that LMDB is read-optimised, not write-optimised. Additionally, as LMDB was primarily developed for use in OpenLDAP, its developers are focused mainly on the development and maintenance of OpenLDAP, not on LMDB per se. The limited time the developers spent presenting the first benchmark results was therefore criticized: the presentation did not state limitations and gave a "silver bullet impression" that was not adequate to an engineer's attitude. (The concerns raised were, however, later addressed to the reviewer's satisfaction by the key developer behind LMDB.)

The presentation did prompt other database developers to dissect the code in depth to understand how and why it works. Reviews range from brief to in-depth. Database developer Oren Eini wrote a 12-part series of articles on his analysis of LMDB, beginning July 9, 2013. The conclusion was along the lines of "impressive codebase ... dearly needs some love", mainly because of overly long methods and code duplication. This review, conducted by a .NET developer with no former experience of C, concluded on August 22, 2013 with "beyond my issues with the code, the implementation is really quite brilliant. The way LMDB manages to pack so much functionality by not doing things is quite impressive... I learned quite a lot from the project, and it has been frustrating, annoying and fascinating experience". Multiple other reviews cover LMDB in various languages, including Chinese.

