HOME

TheInfoList



OR:

In computing, a DBM is a
library A library is a collection of materials, books or media that are accessible for use and not just for display purposes. A library provides physical (hard copies) or digital access (soft copies) materials, and may be a physical location or a vir ...
and
file format A file format is a standard way that information is encoded for storage in a computer file. It specifies how bits are used to encode information in a digital storage medium. File formats may be either proprietary or free. Some file formats ...
providing fast, single-keyed access to data. A key-value database from the original
Unix Unix (; trademarked as UNIX) is a family of multitasking, multiuser computer operating systems that derive from the original AT&T Unix, whose development started in 1969 at the Bell Labs research center by Ken Thompson, Dennis Ritchie, and ot ...
, ''dbm'' is an early example of a
NoSQL A NoSQL (originally referring to "non- SQL" or "non-relational") database provides a mechanism for storage and retrieval of data that is modeled in means other than the tabular relations used in relational databases. Such databases have existed ...
system.


History

The original ''dbm'' library and file format was a simple
database engine A database engine (or storage engine) is the underlying software component that a database management system (DBMS) uses to create, read, update and delete (CRUD) data from a database. Most database management systems include their own application ...
, originally written by
Ken Thompson Kenneth Lane Thompson (born February 4, 1943) is an American pioneer of computer science. Thompson worked at Bell Labs for most of his career where he designed and implemented the original Unix operating system. He also invented the B programmi ...
and released by
AT&T AT&T Inc. is an American multinational telecommunications holding company headquartered at Whitacre Tower in Downtown Dallas, Texas. It is the world's largest telecommunications company by revenue and the third largest provider of mobile tel ...
in 1979. The name is a
three letter acronym A three-letter acronym (TLA), or three-letter abbreviation, is an abbreviation consisting of three letters. These are usually the initial letters of the words of the phrase abbreviated, and are written in capital letters (upper case); three-lette ...
for ''DataBase Manager'', and can also refer to the family of database engines with APIs and features derived from the original ''dbm''. The ''dbm'' library stores arbitrary data by use of a single key (a
primary key In the relational model of databases, a primary key is a ''specific choice'' of a ''minimal'' set of attributes (Column (database), columns) that uniquely specify a tuple (Row (database), row) in a Relation (database), relation (Table (database), t ...
) in fixed-size
buckets A bucket is typically a watertight, vertical cylinder or truncated cone or square, with an open top and a flat bottom, attached to a semicircular carrying handle called the ''bail''. A bucket is usually an open-top container. In contrast, a ...
and uses
hashing Hash, hashes, hash mark, or hashing may refer to: Substances * Hash (food), a coarse mixture of ingredients * Hash, a nickname for hashish, a cannabis product Hash mark * Hash mark (sports), a marking on hockey rinks and gridiron football fiel ...
techniques to enable fast retrieval of the data by key. The hashing scheme used is a form of
extendible hashing Extendible hashing is a type of hash system which treats a hash as a bit string and uses a trie for bucket lookup. Because of the hierarchical nature of the system, re-hashing is an incremental operation (done one bucket at a time, as needed). Thi ...
, so that the hashing scheme expands as new buckets are added to the database, meaning that, when nearly empty, the database starts with one bucket, which is then split when it becomes full. The two resulting child buckets will themselves split when they become full, so the database grows as keys are added. The ''dbm'' library and its derivatives are pre-
relational database A relational database is a (most commonly digital) database based on the relational model of data, as proposed by E. F. Codd in 1970. A system used to maintain relational databases is a relational database management system (RDBMS). Many relatio ...
s they manage
associative array In computer science, an associative array, map, symbol table, or dictionary is an abstract data type that stores a collection of (key, value) pairs, such that each possible key appears at most once in the collection. In mathematical terms an as ...
s, implemented as on-disk
hash table In computing, a hash table, also known as hash map, is a data structure that implements an associative array or dictionary. It is an abstract data type that maps keys to values. A hash table uses a hash function to compute an ''index'', als ...
s. In practice, they can offer a more practical solution for high-speed storage accessed by key, as they do not require the overhead of connecting and preparing queries. This is balanced by the fact that they can generally only be opened for writing by a single process at a time. An agent
daemon Daimon or Daemon (Ancient Greek: , "god", "godlike", "power", "fate") originally referred to a lesser deity or guiding spirit such as the daimons of ancient Greek religion and mythology and of later Hellenistic religion and philosophy. The word ...
can handle requests from multiple processes, but introduces IPC overhead.


Implementations

The original AT&T ''dbm'' library has been replaced by its many successor implementations. Notable examples include: * ''ndbm'' ("new dbm"), based on the original dbm with some new features.
GDBM
("GNU dbm"),
GNU GNU () is an extensive collection of free software (383 packages as of January 2022), which can be used as an operating system or can be used in parts with other operating systems. The use of the completed GNU tools led to the family of operat ...
rewrite of the library implementing ''ndbm'' features and its own interface. Also provides new features like crash tolerance for guaranteeing data consistency. * ''sdbm'' ("small dbm"), a
public domain The public domain (PD) consists of all the creative work A creative work is a manifestation of creative effort including fine artwork (sculpture, paintings, drawing, sketching, performance art), dance, writing (literature), filmmaking, ...
rewrite of ''dbm''. It is a part of the standard distribution for
Perl Perl is a family of two high-level, general-purpose, interpreted, dynamic programming languages. "Perl" refers to Perl 5, but from 2000 to 2019 it also referred to its redesigned "sister language", Perl 6, before the latter's name was offici ...
and is available as an external library for
Ruby A ruby is a pinkish red to blood-red colored gemstone, a variety of the mineral corundum ( aluminium oxide). Ruby is one of the most popular traditional jewelry gems and is very durable. Other varieties of gem-quality corundum are called sa ...
. * ''qdbm'' ("Quick Database Manager"), a higher-performance ''dbm'' employing many of the same techniques as Tokyo/Kyoto Cabinet. Written by the same author before they moved on to the cabinets. * ''tdb'' ("Trivial Database"), a simple database used by
Samba Samba (), also known as samba urbano carioca (''urban Carioca samba'') or simply samba carioca (''Carioca samba''), is a Brazilian music genre that originated in the Afro-Brazilian communities of Rio de Janeiro in the early 20th century. Havin ...
that supports multiple writers. Has a gdbm-based API. *
Berkeley DB Berkeley DB (BDB) is an unmaintained embedded database software library for key/value data, historically significant in open source software. Berkeley DB is written in C with API bindings for many other programming languages. BDB stores arbitr ...
, 1991 replacement of ndbm by
Sleepycat Software Sleepycat Software, Inc. was the software company primarily responsible for maintaining the Berkeley DB packages from 1996 to 2006. Berkeley DB is freely-licensed database software originally developed at the University of California, Berkeley f ...
(now
Oracle An oracle is a person or agency considered to provide wise and insightful counsel or prophetic predictions, most notably including precognition of the future, inspired by deities. As such, it is a form of divination. Description The word '' ...
) created to get around the AT&T Unix copyright on
BSD The Berkeley Software Distribution or Berkeley Standard Distribution (BSD) is a discontinued operating system based on Research Unix, developed and distributed by the Computer Systems Research Group (CSRG) at the University of California, Berk ...
. It features many extensions like parallelism, transactional control, hashing, and B tree storage. * LMDB:
copy-on-write Copy-on-write (COW), sometimes referred to as implicit sharing or shadowing, is a resource-management technique used in computer programming to efficiently implement a "duplicate" or "copy" operation on modifiable resources. If a resource is dupl ...
memory-mapped B+ tree implementation in C with a Berkeley-style API. The following databases are dbm-inspired, but they do not directly provide a dbm interface, even though it would be trivial to wrap one: * cdb ("constant database"), database by Daniel J. Bernstein, database files can only be created and read, but never be modified * Tkrzw, an Apache 2.0 licensed successor to Kyoto Cabinet and Tokyo Cabinet *
WiredTiger WiredTiger is a NoSQL, Open Source extensible platform for data management. It is released under version 2 or 3 of the GNU General Public License. WiredTiger uses MultiVersion Concurrency Control ( MVCC) architecture. MongoDB acquired WiredTiger ...
: database with traditional row-oriented and column-oriented storage.


Availability

As of 2001, the ''ndbm'' implementation of DBM was standard on Solaris and IRIX, whereas ''gdbm'' is ubiquitous on
Linux Linux ( or ) is a family of open-source Unix-like operating systems based on the Linux kernel, an operating system kernel first released on September 17, 1991, by Linus Torvalds. Linux is typically packaged as a Linux distribution, which ...
. The Berkeley DB implementations were standard on some free operating systems. After a change of licensing of the Berkeley DB to
GNU AGPL The GNU Affero General Public License (GNU AGPL) is a free, copyleft license published by the Free Software Foundation in November 2007, and based on the GNU General Public License, version 3 and the Affero General Public License. The Free So ...
in 2013, projects like
Debian Debian (), also known as Debian GNU/Linux, is a Linux distribution composed of free and open-source software, developed by the community-supported Debian Project, which was established by Ian Murdock on August 16, 1993. The first version of D ...
have moved to LMDB.


Reliability

A 2018
AFL AFL may refer to: Sports * American Football League (AFL), a name shared by several separate and unrelated professional American football leagues: ** American Football League (1926) (a.k.a. "AFL I"), first rival of the National Football Leagu ...
fuzzing In programming and software development, fuzzing or fuzz testing is an automated software testing technique that involves providing invalid, unexpected, or random data as inputs to a computer program. The program is then monitored for exceptions ...
test against many DBM-family databases exposed many problems in implementations when it comes to corrupt or invalid database files. Only ''freecdb'' by Daniel J. Bernstein showed no crashes. The authors of gdbm, tdb, and lmdb were prompt to respond. Berkeley DB fell behind due to the sheer amount of other issues; the fixes would be irrelevant to open-source software users due to the licensing change locking them back on an old version.


See also

*
Embedded database An embedded database system is a database management system (DBMS) which is tightly integrated with an application software; it is embedded in the application. It is a broad technology category that includes: * database systems with differing ...
*
Flat file database A flat-file database is a database stored in a file called a flat file. Records follow a uniform format, and there are no structures for indexing or recognizing relationships between records. The file is simple. A flat file can be a plain ...
*
ISAM ISAM (an acronym for indexed sequential access method) is a method for creating, maintaining, and manipulating computer files of data so that records can be retrieved sequentially or randomly by one or more keys. Indexes of key fields are mainta ...
* Key-value database *
Mobile database Mobile computing devices (e.g., smartphones and PDAs) store and share data over a mobile network, or a database which is actually stored by the mobile device. This could be a list of contacts, price information, distance travelled, or any other in ...
*
NoSQL A NoSQL (originally referring to "non- SQL" or "non-relational") database provides a mechanism for storage and retrieval of data that is modeled in means other than the tabular relations used in relational databases. Such databases have existed ...
*
Semaphore (programming) In computer science, a semaphore is a variable or abstract data type used to control access to a common resource by multiple threads and avoid critical section problems in a concurrent system such as a multitasking operating system. Semaphores ...


References


Bibliography

* * *
SDBM library
@
Apache The Apache () are a group of culturally related Native American tribes in the Southwestern United States, which include the Chiricahua, Jicarilla, Lipan, Mescalero, MimbreƱo, Ndendahe (Bedonkohe or Mogollon and Nednhi or CarrizaleƱo an ...
* *{{cite web , last1=Olson , first1=Michael A. , last2=Bostic , first2=Keith , last3=Seltzer , first3=Margo , year=1999 , title=Berkeley DB , work=Proceedings of the FREENIX Track:1999 USENIX Annual Technical Conference , url=http://www.usenix.org/events/usenix99/full_papers/olson/olson.pdf Database engines Free database management systems Structured storage Embedded databases