MongoDB is a source-available, cross-platform, document-oriented database program. Classified as a NoSQL database program, MongoDB uses JSON-like documents with optional schemas. MongoDB is developed by MongoDB Inc. and licensed under the Server Side Public License (SSPL), which is deemed non-free by several distributions.


History

The software company 10gen began developing MongoDB in 2007 as a component of a planned platform as a service product. In 2009, the company shifted to an open-source development model, offering commercial support and other services. In 2013, 10gen changed its name to MongoDB Inc. On October 20, 2017, MongoDB became a publicly traded company, listed on NASDAQ as MDB with an IPO price of $24 per share. MongoDB is a global company with US headquarters in New York City and international headquarters in Dublin, Ireland. On October 30, 2019, MongoDB teamed up with Alibaba Cloud to offer Alibaba Cloud's customers a MongoDB-as-a-service solution. Customers can use the managed offering from Alibaba Cloud's global data centers.


Main features


Ad-hoc queries

MongoDB supports field, range query, and regular-expression searches. Queries can return specific fields of documents and also include user-defined JavaScript functions. Queries can also be configured to return a random sample of results of a given size.
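
The three kinds of search described above, together with returning only specific fields, can be illustrated with a small in-memory sketch. This is not MongoDB's implementation and does not talk to a server. The sample documents and the helper names matches and find are hypothetical; the operator names $gte, $lt and $regex are borrowed from MongoDB's query language only to make the analogy readable.

```python
import re

# Hypothetical in-memory "collection"; field names are invented for illustration.
people = [
    {"name": "Ada", "age": 36, "city": "London"},
    {"name": "Alan", "age": 41, "city": "Manchester"},
    {"name": "Grace", "age": 29, "city": "New York"},
]

def matches(doc, query):
    """Return True if doc satisfies every condition in query."""
    for field, cond in query.items():
        value = doc.get(field)
        if isinstance(cond, dict):
            # Operator form: range bounds and regular expressions.
            if "$gte" in cond and not (value is not None and value >= cond["$gte"]):
                return False
            if "$lt" in cond and not (value is not None and value < cond["$lt"]):
                return False
            if "$regex" in cond and not (
                isinstance(value, str) and re.search(cond["$regex"], value)
            ):
                return False
        elif value != cond:
            # Plain value: exact field match.
            return False
    return True

def find(docs, query, fields=None):
    """Filter docs by query and optionally keep only the named fields."""
    hits = [d for d in docs if matches(d, query)]
    if fields is None:
        return hits
    return [{k: d[k] for k in fields if k in d} for d in hits]

by_field = find(people, {"city": "London"}, fields=["name"])
by_range = find(people, {"age": {"$gte": 30, "$lt": 40}}, fields=["name"])
by_regex = find(people, {"name": {"$regex": "^A"}}, fields=["name"])
```

A real deployment would evaluate such conditions inside the database, usually with the help of indexes, rather than scanning a Python list.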


Indexing

Fields in a MongoDB document can be indexed with primary and secondary indices.


Replication

MongoDB provides high availability with replica sets. A replica set consists of two or more copies of the data. Each replica-set member may act in the role of primary or secondary replica at any time. All writes and reads are done on the primary replica by default. Secondary replicas maintain a copy of the data of the primary using built-in replication. When a primary replica fails, the replica set automatically conducts an election process to determine which secondary should become the primary. Secondaries can optionally serve read operations, but that data is only eventually consistent by default. If the replicated MongoDB deployment only has a single secondary member, a separate daemon called an arbiter must be added to the set. It has a single responsibility, which is to resolve the election of the new primary. As a consequence, an idealized distributed MongoDB deployment requires at least three separate servers, even in the case of just one primary and one secondary.
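
The need for a third member can be made concrete with a little arithmetic. The sketch below assumes that an election succeeds only when more than half of the voting members can reach each other; that majority rule is stated here as an assumption and is not spelled out in the text above. The function names are hypothetical.

```python
def majority(voting_members: int) -> int:
    """Smallest number of votes that is more than half of the voting members."""
    return voting_members // 2 + 1

def can_elect_primary(voting_members: int, reachable: int) -> bool:
    """Assumed rule: a primary can be elected only if a majority is reachable."""
    return reachable >= majority(voting_members)

# Primary plus one secondary, no arbiter: losing one member leaves 1 of 2 votes.
two_member_survives = can_elect_primary(voting_members=2, reachable=1)

# Primary, secondary and arbiter: losing one member leaves 2 of 3 votes.
three_member_survives = can_elect_primary(voting_members=3, reachable=2)
```

Under this assumption a two-member set cannot elect a new primary after a single failure, whereas adding an arbiter as a third voter lets the surviving pair form a majority.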


Load balancing

MongoDB scales horizontally using sharding. The user chooses a shard key, which determines how the data in a collection will be distributed. The data is split into ranges (based on the shard key) and distributed across multiple shards. (A shard is a master with one or more replicas.) Alternatively, the shard key can be hashed to map to a shard, enabling an even data distribution. MongoDB can run over multiple servers, balancing the load or duplicating data to keep the system up and running in case of hardware failure.
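
The difference between ranged and hashed distribution can be sketched as follows. This is an illustration only: the number of shards, the split points in RANGE_BOUNDS, and the use of SHA-256 as the hash are all assumptions chosen for the example, not MongoDB's actual routing logic.

```python
import bisect
import hashlib
from collections import Counter

NUM_SHARDS = 3
# Hypothetical split points: shard 0 holds keys below 1000,
# shard 1 holds [1000, 2000), shard 2 holds 2000 and above.
RANGE_BOUNDS = [1000, 2000]

def ranged_shard(key: int) -> int:
    """Pick the shard whose key range contains the shard key."""
    return bisect.bisect_right(RANGE_BOUNDS, key)

def hashed_shard(key: int) -> int:
    """Pick a shard from a hash of the shard key (SHA-256 as a stand-in)."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS

# Monotonically increasing keys, such as sequential order numbers.
keys = range(1000, 2000)
ranged_counts = Counter(ranged_shard(k) for k in keys)
hashed_counts = Counter(hashed_shard(k) for k in keys)
```

With these inputs every key falls into the same range and therefore lands on a single shard, while hashing scatters the same keys across the shards. The trade-off is that hashing gives up locality: neighbouring key values no longer sit on the same shard, so range scans have to visit several shards.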


File storage

MongoDB can be used as a file system, called GridFS, with load balancing and data replication features over multiple machines for storing files. This function, called a grid file system, is included with MongoDB drivers. MongoDB exposes functions for file manipulation and content to developers. GridFS can be accessed using the mongofiles utility or plugins for Nginx and lighttpd. GridFS divides a file into parts, or chunks, and stores each of those chunks as a separate document.
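
The chunking idea can be sketched in a few lines. This is not the GridFS driver code: the tiny chunk size is chosen so the result is easy to inspect, the function names are hypothetical, and the field names files_id, n and data are modelled on GridFS chunk documents but should be treated as assumptions here.

```python
CHUNK_SIZE = 4  # bytes; deliberately tiny for illustration

def split_into_chunks(file_id, data: bytes, chunk_size: int = CHUNK_SIZE):
    """Split a byte string into numbered chunk documents."""
    return [
        {"files_id": file_id, "n": i, "data": data[offset:offset + chunk_size]}
        for i, offset in enumerate(range(0, len(data), chunk_size))
    ]

def reassemble(chunks):
    """Rebuild the original bytes by ordering chunks on their sequence number."""
    return b"".join(c["data"] for c in sorted(chunks, key=lambda c: c["n"]))

chunks = split_into_chunks("file-1", b"hello gridfs")
restored = reassemble(list(reversed(chunks)))
```

Because each chunk carries its own sequence number, the chunks can be stored and fetched in any order and still be reassembled correctly.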


Aggregation

MongoDB provides three ways to perform aggregation: the aggregation pipeline, the map-reduce function, and single-purpose aggregation methods. Map-reduce can be used for batch processing of data and aggregation operations. However, according to MongoDB's documentation, the aggregation pipeline provides better performance for most aggregation operations. The aggregation framework enables users to obtain the kind of results for which the SQL GROUP BY clause is used. Aggregation operators can be strung together to form a pipeline, analogous to Unix pipes. The aggregation framework includes the $lookup operator, which can join documents from multiple collections, as well as statistical operators such as standard deviation.
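
The pipeline idea, in which each stage consumes the output of the previous one, can be illustrated in plain Python. The sketch below is an analogy rather than MongoDB's aggregation engine: the stage helpers match, group_sum and sort_by and the sample orders are hypothetical, and they only loosely mirror what filtering, grouping and sorting stages do.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical sample documents.
orders = [
    {"customer": "a", "status": "paid", "amount": 10},
    {"customer": "b", "status": "paid", "amount": 5},
    {"customer": "a", "status": "paid", "amount": 7},
    {"customer": "b", "status": "open", "amount": 99},
]

def match(predicate):
    """Stage that keeps only documents satisfying the predicate."""
    return lambda docs: [d for d in docs if predicate(d)]

def group_sum(key, field):
    """Stage that groups by key and sums field, like SQL GROUP BY with SUM."""
    def stage(docs):
        totals = defaultdict(int)
        for d in docs:
            totals[d[key]] += d[field]
        return [{"_id": k, "total": v} for k, v in totals.items()]
    return stage

def sort_by(field, descending=False):
    """Stage that orders documents by a field."""
    return lambda docs: sorted(docs, key=lambda d: d[field], reverse=descending)

def run_pipeline(docs, stages):
    """Feed the output of each stage into the next, like a Unix pipe."""
    return reduce(lambda acc, stage: stage(acc), stages, docs)

result = run_pipeline(orders, [
    match(lambda d: d["status"] == "paid"),
    group_sum("customer", "amount"),
    sort_by("total", descending=True),
])
```

Placing the filtering stage first keeps later stages from processing documents that would be discarded anyway, which is the same reasoning that applies to ordering commands in a shell pipeline.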


Server-side JavaScript execution

JavaScript can be used in queries and aggregation functions (such as MapReduce), and can be sent directly to the database to be executed.


Capped collections

MongoDB supports fixed-size collections called capped collections. This type of collection maintains insertion order and, once the specified size has been reached, behaves like a circular queue.
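
The circular-queue behaviour can be approximated with a bounded deque from Python's standard library. This is only an analogy: the deque below is limited by a count of items, chosen here as an assumption for simplicity, and the event names are invented.

```python
from collections import deque

# A bounded queue that keeps the three most recent entries.
capped = deque(maxlen=3)

for event in ["e1", "e2", "e3", "e4", "e5"]:
    # Once the queue is full, appending silently drops the oldest entry.
    capped.append(event)

contents = list(capped)
```

Insertion order is preserved among the surviving entries, and the oldest entries are the first to be overwritten, which is the behaviour the paragraph above describes.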


Transactions

MongoDB claims to support multi-document ACID transactions since the 4.0 release in June 2018. This claim was found to not be true as MongoDB violates snapshot isolation.


Editions


MongoDB Community Server

The MongoDB Community Edition is free and available for Windows, Linux, and macOS.


MongoDB Enterprise Server

MongoDB Enterprise Server is the commercial edition of MongoDB, available as part of the MongoDB Enterprise Advanced subscription.


MongoDB Atlas

MongoDB is also available as an on-demand fully managed service. MongoDB Atlas runs on AWS, Microsoft Azure, and Google Cloud Platform. On March 10, 2022, MongoDB warned its users in Russia and Belarus that their data stored on the MongoDB Atlas platform would be destroyed.


Architecture


Programming language accessibility

MongoDB has official drivers for major programming languages and development environments. There are also a large number of unofficial or community-supported drivers for other programming languages and frameworks.


Serverless access


Management and graphical front-ends

The primary interface to the database has been the mongo shell. MongoDB Compass was introduced as the native GUI in MongoDB 3.2. There are also products and third-party projects that offer user interfaces for administration and data viewing.


Licensing


MongoDB Community Server

As of October 2018, MongoDB is released under the Server Side Public License (SSPL), a non-free license developed by the project. It replaces the GNU Affero General Public License, and is nearly identical to the GNU General Public License version 3, but requires that those making the software publicly available as part of a "service" must make the service's entire source code (insofar as a user would be able to recreate the service themselves) available under this license. By contrast, the AGPL only requires the source code of the licensed software to be provided to users when the software is conveyed over a network. The SSPL was submitted for certification to the Open Source Initiative but later withdrawn. In January 2021, the Open Source Initiative stated that SSPL is not an open source license. The language drivers are available under an Apache License. In addition, MongoDB Inc. offers proprietary licenses for MongoDB. The last versions licensed as AGPL version 3 are 4.0.3 (stable) and 4.1.4.

MongoDB has been removed from the Debian, Fedora and Red Hat Enterprise Linux distributions due to the licensing change. Fedora determined that the SSPL version 1 is not a free software license because it is "intentionally crafted to be aggressively discriminatory" towards commercial users.


Bug reports and criticisms


Security

Because the default security configuration of MongoDB allowed anyone to have full access to the database, data from tens of thousands of MongoDB installations has been stolen. Furthermore, many MongoDB servers have been held for ransom. In an official response published in September 2017 and updated in January 2018, Davi Ottenheimer, lead of Product Security at MongoDB, stated that MongoDB had taken measures to defend against these risks. From the MongoDB 2.6 release onwards, the binaries from the official MongoDB RPM and DEB packages bind to localhost by default. From MongoDB 3.6, this default behavior was extended to all MongoDB packages across all platforms. As a result, all networked connections to the database are denied unless explicitly configured by an administrator.


Technical criticisms

In some failure scenarios where an application can access two distinct MongoDB processes, but these processes cannot access each other, it is possible for MongoDB to return stale reads. In this scenario it is also possible for MongoDB to roll back writes that have been acknowledged. This issue has been addressed since version 3.4.0, released in November 2016 (and back-ported to v3.2.12).

Before version 2.2, locks were implemented on a per-server process basis. With version 2.2, locks were implemented at the database level. Version 3.0 introduced pluggable storage engines, and each storage engine may implement locks differently. With MongoDB 3.0, locks are implemented at the collection level for the MMAPv1 storage engine, while the WiredTiger storage engine uses an optimistic concurrency protocol that effectively provides document-level locking. Even with versions prior to 3.0, one approach to increase concurrency is to use sharding. In some situations, reads and writes will yield their locks. If MongoDB predicts a page is unlikely to be in memory, operations will yield their lock while the pages load. The use of lock yielding expanded greatly in 2.2.

Up until version 3.3.11, MongoDB could not do collation-based sorting and was limited to byte-wise comparison via memcmp, which would not provide correct ordering for many non-English languages when used with a Unicode encoding. The issue was fixed on August 23, 2016.

Prior to MongoDB 4.0, queries against an index were not atomic. Documents which were being updated while the query was running could be missed. The introduction of the snapshot read concern in MongoDB 4.0 eliminated this phenomenon.

Although MongoDB claims in an undated article entitled "MongoDB and Jepsen" that their database passed the tests of the distributed systems safety research company Jepsen, which it called “the industry’s toughest data safety, correctness, and consistency Tests”, Jepsen published an article in May 2020 stating that MongoDB 3.6.4 had in fact failed their tests, and that the newer MongoDB 4.2.6 has more problems, including “retrocausal transactions” where a transaction reverses order so that a read can see the result of a future write. Jepsen noted in their report that MongoDB omitted any mention of these findings on MongoDB's "MongoDB and Jepsen" page.
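
The byte-wise comparison problem mentioned above, where memcmp ordering fails for many non-English languages, can be reproduced in a few lines. The sketch below compares UTF-8 byte order with a crude accent-insensitive ordering. The fold function is a hypothetical stand-in and is far simpler than a real collation algorithm, which would follow language-specific rules.

```python
import unicodedata

# "\u00e9clair" is "eclair" with an acute accent on the first letter.
words = ["zebra", "\u00e9clair", "apple"]

# Byte-wise ordering, comparable to memcmp on the UTF-8 encoding.
bytewise = sorted(words, key=lambda w: w.encode("utf-8"))

def fold(word: str) -> str:
    """Crude stand-in for collation: strip accents, then ignore case."""
    decomposed = unicodedata.normalize("NFD", word)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()

accent_aware = sorted(words, key=fold)
```

In UTF-8 the accented letter is encoded with bytes above those of any ASCII letter, so byte-wise ordering places the accented word after "zebra", while the accent-insensitive ordering places it between "apple" and "zebra" as a reader of French would expect.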


MongoDB Conference

MongoDB Inc. hosts an annual developer conference which has been referred to as either MongoDB World or MongoDB.live.


See also

* Apache Cassandra
* BSON, the binary JSON format MongoDB uses for data storage and transfer
* List of server-side JavaScript implementations
* MEAN, a solutions stack using MongoDB as the database
* Server-side scripting
* TokuMX, a fork of MongoDB with stronger consistency and new index structures
* Amazon DocumentDB, a proprietary database service designed for MongoDB compatibility

