Berkeley DB XML

Berkeley DB (BDB) is an embedded database software library for key/value data, historically significant in open-source software. Berkeley DB is written in C with API bindings for many other programming languages. BDB stores arbitrary key/data pairs as byte arrays and supports multiple data items for a single key. Berkeley DB is not a relational database, although it has database features including database transactions, multiversion concurrency control and write-ahead logging. BDB runs on a wide variety of operating systems, including most Unix-like and Windows systems, and real-time operating systems.

BDB was commercially supported and developed by Sleepycat Software from 1996 to 2006. Sleepycat Software was acquired in February 2006 by Oracle Corporation, which continued to develop and sell the C Berkeley DB library. In 2013 Oracle re-licensed BDB under the AGPL and released new versions until May 2020. Bloomberg L.P. continues to develop a fork of the 2013 version of BDB within their Comdb2 database, under the original Sleepycat permissive license.
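As a rough illustration of the data model described above (byte-array keys, byte-array data, and several data items allowed under one key), the following is a minimal in-memory sketch in Python. It is not the Berkeley DB API: the class name `DuplicateKeyStore` and its methods are hypothetical and exist only to show the shape of the model.

```python
from collections import defaultdict


class DuplicateKeyStore:
    """Toy in-memory model of a key/data store (hypothetical, not the BDB API).

    Keys and data items are plain byte strings, and one key may hold
    several data items, kept here in insertion order.
    """

    def __init__(self):
        self._items = defaultdict(list)

    def put(self, key, data):
        # The store never interprets the bytes; it only checks their type.
        if not isinstance(key, bytes) or not isinstance(data, bytes):
            raise TypeError("keys and data must be bytes")
        self._items[key].append(data)

    def get_all(self, key):
        # Return a copy of every data item stored under the key.
        return list(self._items.get(key, []))


store = DuplicateKeyStore()
store.put(b"fruit", b"apple")
store.put(b"fruit", b"pear")
fruit_items = store.get_all(b"fruit")
```

The sketch keeps everything in memory and has no persistence, locking, or transactions; those are the parts a real embedded database library supplies.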


Origin

Berkeley DB originated at the University of California, Berkeley as part of BSD, Berkeley's version of the Unix operating system. After 4.3BSD (1986), the BSD developers attempted to remove or replace all code originating in the original AT&T Unix from which BSD was derived. In doing so, they needed to rewrite the Unix database package. Seltzer and Yigit created a new database, unencumbered by any AT&T patents: an on-disk hash table that outperformed the existing dbm libraries. Berkeley DB itself was first released in 1991 and later included with 4.4BSD. In 1996 Netscape requested that the authors of Berkeley DB improve and extend the library, then at version 1.86, to suit Netscape's requirements for an LDAP server and for use in the Netscape browser. That request led to the creation of Sleepycat Software. This company was acquired by Oracle Corporation in February 2006.

Berkeley DB 1.x releases focused on managing key/value data storage and are referred to as "Data Store" (DS). The 2.x releases added a locking system enabling concurrent access to data; this is known as "Concurrent Data Store" (CDS). The 3.x releases added a logging system for transactions and recovery, called "Transactional Data Store" (TDS). The 4.x releases added the ability to replicate log records and create a distributed, highly available, single-master multi-replica database; this is called the "High Availability" (HA) feature set. Berkeley DB's evolution has sometimes led to minor API changes or log format changes, but database formats have very rarely changed. Berkeley DB HA supports online upgrades from one version to the next by maintaining the ability to read and apply the prior release's log records.

Starting with the 6.0.21 (Oracle 12c) release, all Berkeley DB products are licensed under the GNU AGPL. Previously, Berkeley DB was redistributed under the 4-clause BSD license (before version 2.0), and the Sleepycat Public License, which is an OSI-approved open-source license as well as an FSF-approved free software license. The product ships with complete source code, build script, test suite, and documentation. The comprehensive feature set, along with the licensing terms, has led to its use in a multitude of free and open-source software. Those who do not wish to abide by the terms of the GNU AGPL, or to use an older version under the Sleepycat Public License, have the option of purchasing a proprietary license for redistribution from Oracle Corporation. This technique is called dual licensing.

Berkeley DB includes compatibility interfaces for some historic Unix database libraries: dbm, ndbm and hsearch (a System V and POSIX library for creating in-memory hash tables).
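For readers unfamiliar with the dbm family, its interface amounts to a persistent dictionary of byte strings. The sketch below uses Python's standard-library `dbm.dumb` module as a stand-in; it is a pure-Python implementation and not Berkeley DB or its compatibility layer. The file name `example` and the helper `dbm_roundtrip` are made up for the illustration.

```python
import dbm.dumb
import os
import tempfile


def dbm_roundtrip(directory):
    """Write two records to a dbm-style file, reopen it, and read them back."""
    path = os.path.join(directory, "example")  # hypothetical file name

    # "c" opens the database for reading and writing, creating it if needed.
    with dbm.dumb.open(path, "c") as db:
        db[b"alice"] = b"engineer"
        db[b"bob"] = b"analyst"

    # Reopen read-only to show that the records persisted on disk.
    with dbm.dumb.open(path, "r") as db:
        return db[b"alice"], sorted(db.keys())


with tempfile.TemporaryDirectory() as tmp:
    value, keys = dbm_roundtrip(tmp)
```

The point of the sketch is the access pattern (open, store by key, fetch by key, close), which is what a dbm-compatible interface preserves for existing programs.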


Architecture

Berkeley DB has an architecture notably simpler than relational database management systems. Like SQLite and LMDB, it is not based on a server/client model and does not provide support for network access; programs access the database using in-process API calls. Oracle added support for SQL in the 11g R2 release, based on the popular SQLite API, by including a version of SQLite in Berkeley DB (it uses Berkeley DB for storage).

A program accessing the database is free to decide how the data is to be stored in a record. Berkeley DB puts no constraints on the record's data. The record and its key can both be up to four gigabytes long.

Berkeley DB supports database features such as ACID transactions, fine-grained locking, hot backups and replication.
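Because the library treats a record as opaque bytes, the record layout is entirely the application's choice. The following Python sketch shows one possible layout using the standard `struct` module. The field set (an account id, a balance, and a name) and the function names are assumptions invented for the example, not anything Berkeley DB defines.

```python
import struct

# Hypothetical fixed header: little-endian uint32 id, float64 balance,
# uint16 length of the UTF-8 name that follows the header.
HEADER_FORMAT = "<IdH"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


def encode_account(account_id, balance, name):
    """Pack an application-defined record into a byte string."""
    name_bytes = name.encode("utf-8")
    header = struct.pack(HEADER_FORMAT, account_id, balance, len(name_bytes))
    return header + name_bytes


def decode_account(record):
    """Unpack a byte string produced by encode_account."""
    account_id, balance, name_len = struct.unpack(
        HEADER_FORMAT, record[:HEADER_SIZE]
    )
    name = record[HEADER_SIZE:HEADER_SIZE + name_len].decode("utf-8")
    return account_id, balance, name


record = encode_account(7, 12.5, "Ada")
decoded = decode_account(record)
```

A key/value store would hold `record` verbatim under whatever key the application chooses. Any other serialization (JSON, a fixed C struct, a length-prefixed format) would work equally well, since the store never interprets the bytes.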


Oracle Corporation's use of the name "Berkeley DB"

The name "Berkeley DB" is used by Oracle Corporation for three different products, only one of which is BDB:
1. Berkeley DB, the C database library that is the subject of this article
2. Berkeley DB Java Edition, a pure Java library whose design is modelled after the C library but is otherwise unrelated
3. Berkeley DB XML, a C++ program that supports XQuery, and which includes a legacy version of the C database library


Open-source programs still using Berkeley DB

BDB was once very widespread, but usage dropped steeply from 2013 (see the Licensing section). Notable software that still uses Berkeley DB for data storage includes:
* Bogofilter – A free/open-source spam filter that saves its wordlists using Berkeley DB by default.
* Citadel/UX – Collaborative software (messaging and groupware) that is directly descended from the Citadel family of programs, which became popular in the 1980s and 1990s as a bulletin board system platform.
* Sendmail – A free/open-source mail transfer agent (MTA) for Linux/Unix systems, first released in 1983.
* SpamAssassin – A free/open-source anti-spam application.

Open-source operating systems and languages such as Perl and Python still support old Berkeley DB interfaces. The FreeBSD and OpenBSD operating systems ship Berkeley DB 1.8x to support the dbopen() library call used by password programs such as pwd_mkdb. Linux operating systems, including those based on Debian and Fedora, ship Berkeley DB 5.3 libraries.


Licensing

Berkeley DB V2.0 and higher is available under a dual license:
1. Oracle commercial license
2. Open source license:
   * V2.0 - V6.0.19 is licensed under the Sleepycat License
   * V6.0.20 and newer is licensed under the GNU AGPL v3

Switching the open source license in 2013 from the Sleepycat license to the AGPL had a major effect on open source software. Since BDB is a library, any application linking to it must be under an AGPL-compatible license. Many open source applications and all closed source applications would need to be relicensed to become AGPL-compatible, which was not acceptable to many developers and open source operating systems. By 2013 there were many alternatives to BDB, and Debian Linux was typical in its decision to completely phase out Berkeley DB, with a preference for the Lightning Memory-Mapped Database (LMDB).


See also

* VSAM


External links


Oracle Berkeley DB



Oracle Berkeley DB Documentation



Licensing pitfalls for Oracle Technology Products

Oracle Licensing Knowledge Net

''The Berkeley DB Book'' by Himanshu Yadava