YugabyteDB
   HOME

TheInfoList



OR:

YugabyteDB is a high-performance transactional
distributed SQL A distributed SQL database is a single relational database which replicates data across multiple servers. Distributed SQL databases are strongly consistent and most support consistency across racks, data centers, and wide area networks including c ...
database In computing, a database is an organized collection of data stored and accessed electronically. Small databases can be stored on a file system, while large databases are hosted on computer clusters or cloud storage. The design of databases ...
for cloud-native applications, developed by Yugabyte.


History

Yugabyte was founded by ex-
Facebook Facebook is an online social media and social networking service owned by American company Meta Platforms. Founded in 2004 by Mark Zuckerberg with fellow Harvard College students and roommates Eduardo Saverin, Andrew McCollum, Dust ...
engineers Kannan Muthukkaruppan, Karthik Ranganathan, and Mikhail Bautin. At Facebook, they were part of the team that built and operated
Cassandra Cassandra or Kassandra (; Ancient Greek: Κασσάνδρα, , also , and sometimes referred to as Alexandra) in Greek mythology was a Trojan priestess dedicated to the god Apollo and fated by him to utter true prophecies but never to be belie ...
and HBase during a period of significant growth in workloads such as Facebook
Messenger ''MESSENGER'' was a NASA robotic space probe that orbited the planet Mercury between 2011 and 2015, studying Mercury's chemical composition, geology, and magnetic field. The name is a backronym for "Mercury Surface, Space Environment, Geochem ...
and Facebook's
Operational Data Store An operational data store (ODS) is used for operational reporting and as a source of data for the enterprise data warehouse (EDW). It is a complementary element to an EDW in a decision support environment, and is used for operational reporting, co ...
. The founders came together in February 2016 to build YugabyteDB, believing that the trends they experienced at Facebook –
microservices A microservice architecture – a variant of the service-oriented architecture structural style – is an architectural pattern that arranges an application as a collection of loosely-coupled, fine-grained services, communicating through lightwe ...
,
containerization Containerization is a system of intermodal freight transport using intermodal containers (also called shipping containers and ISO containers). Containerization is also referred as "Container Stuffing" or "Container Loading", which is the p ...
,
high availability High availability (HA) is a characteristic of a system which aims to ensure an agreed level of operational performance, usually uptime, for a higher than normal period. Modernization has resulted in an increased reliance on these systems. F ...
, geographic distribution, APIs, and
open-source Open source is source code that is made freely available for possible modification and redistribution. Products include permission to use the source code, design documents, or content of the product. The open-source model is a decentralized so ...
– were relevant to all businesses, especially as they move from on-premise to cloud-native operations. YugabyteDB was initially available in two editions: community and enterprise. In July 2019, Yugabyte open sourced previously commercial features and launched YugabyteDB as open-source under the
Apache The Apache () are a group of culturally related Native American tribes in the Southwestern United States, which include the Chiricahua, Jicarilla, Lipan, Mescalero, Mimbreño, Ndendahe (Bedonkohe or Mogollon and Nednhi or Carrizaleño a ...
2.0 license. The rapid evolution of the product led to being named as a 2020
Gartner Gartner, Inc is a technological research and consulting firm based in Stamford, Connecticut that conducts research on technology and shares this research both through private consulting as well as executive programs and conferences. Its client ...
Cool Vendor in Data Management. Yugabyte launched Yugabyte Cloud, now renamed YugabyteDB Managed, a fully managed database-as-a-service offering of YugabyteDB, in September 2021.


Funding

Six years after the company's inception, Yugabyte closed a $188 Million Series C funding round to become a
Unicorn The unicorn is a legendary creature that has been described since antiquity as a beast with a single large, pointed, spiraling horn projecting from its forehead. In European literature and art, the unicorn has for the last thousand years o ...
start-up with a valuation of $1.3Bn


Architecture

YugabyteDB is a distributed SQL database that aims to be strongly transactionally consistent across failure zones (i.e.
ACID In computer science, ACID ( atomicity, consistency, isolation, durability) is a set of properties of database transactions intended to guarantee data validity despite errors, power failures, and other mishaps. In the context of databases, a se ...
compliance]. Jepsen testing, the ''de facto'' industry standard for verifying correctness, has never fully passed, mainly due to race conditions during schema changes. In
CAP Theorem In theoretical computer science, the CAP theorem, also named Brewer's theorem after computer scientist Eric Brewer, states that any distributed data store can provide only two of the following three guarantees:Seth Gilbert and Nancy Lynch"Brewe ...
terms YugabyteDB is a Consistent/Partition Tolerant (CP) database. YugabyteDB has two layers, a storage engine known as DocDB and the Yugabyte Query Layer.


DocDB

The storage engine consists of a customized
RocksDB RocksDB is a high performance embedded database for key-value data. It is a fork of Google's LevelDB optimized to exploit many CPU cores, and make efficient use of fast storage, such as solid-state drives (SSD), for input/output (I/O) bound wo ...
combined with sharding and load balancing algorithms for the data. In addition, the Raft consensus algorithm controls the replication of data between the nodes. There is also a
Distributed transaction A distributed transaction is a database transaction in which two or more network hosts are involved. Usually, hosts provide transactional resources, while the transaction manager is responsible for creating and managing a global transaction that enc ...
manager and
Multiversion concurrency control Multiversion concurrency control (MCC or MVCC), is a concurrency control method commonly used by database management systems to provide concurrent access to the database and in programming languages to implement transactional memory. Description ...
(MVCC) to support distributed transactions. The engine also exploits a Hybrid Logical Clock that combines coarsely-synchronized physical clocks with Lamport clocks to track causal relationships. The DocDB layer is not directly accessible by users.


YugabyteDB Query Layer

Yugabyte has a pluggable query layer that abstracts the query layer from the storage layer below. There are currently two APIs that can access the database: YSQL is a PostgreSQL code-compatible API based around v11.2. YSQL is accessed via standard PostgreSQL drivers using native protocols. It exploits the native PostgreSQL code for the query layer and replaces the storage engine with calls to the pluggable query layer. This re-use means that Yugabyte supports many features, including: * Triggers & Stored Procedures * PostgreSQL extensions that operate in the query layer * Native JSONB support YCQL is a Cassandra-like API based around v3.10 and re-written in C++. YCQL is accessed via standard Cassandra drivers using the native protocol port of 9042. In addition to the 'vanilla' Cassandra components, YCQL is augmented with the following features: * Transactional consistency - unlike Cassandra, Yugabyte YCQL is transactional. *
JSON JSON (JavaScript Object Notation, pronounced ; also ) is an open standard file format and data interchange format that uses human-readable text to store and transmit data objects consisting of attribute–value pairs and arrays (or other se ...
data types supported natively * Tables can have secondary indexes Currently, data written to either API is not accessible via the other API, however YSQL can access YCQL using the PostgreSQL foreign data wrapper feature. The security model for accessing the system is inherited from the API, so access controls for YSQL look like PostgreSQL, and YCQL looks like Cassandra access controls.


Cluster-to-cluster replication

In addition to its core functionality of distributing a single database, YugabyteDB has the ability to replicate between database instances. The replication can be one-way or bi-directional and is asynchronous. One-way replication is used either to create a read-only copy for workload off-loading or in a read-write mode to create an active-passive standby. Bi-directional replication is generally used in read-write configurations and is used for active-active configurations, geo-distributed applications, etc.


See also

*
Cloud database A cloud database is a database that typically runs on a cloud computing platform and access to the database is provided as-a-service. There are two common deployment models: users can run databases on the cloud independently, using a virtual machin ...
*
Distributed SQL A distributed SQL database is a single relational database which replicates data across multiple servers. Distributed SQL databases are strongly consistent and most support consistency across racks, data centers, and wide area networks including c ...
*
Comparison of relational database management systems The following tables compare general and technical information for a number of relational database management systems. Please see the individual products' articles for further information. Unless otherwise specified in footnotes, comparisons are ba ...
*
Comparison of object–relational database management systems This is a comparison of object–relational database management systems (ORDBMSs). Each system has at least some features of an object–relational database; they vary widely in their completeness and the approaches taken. The following tables c ...
*
Cloud native computing Cloud native computing is an approach in software development that utilizes cloud computing to "build and run scalable applications in modern, dynamic environments such as public, private, and hybrid clouds". These technologies such as containers, ...
*
Database management system In computing, a database is an organized collection of data stored and accessed electronically. Small databases can be stored on a file system, while large databases are hosted on computer clusters or cloud storage. The design of databases ...
*
List of databases using MVCC The following database management systems and other software use multiversion concurrency control. Databases * Altibase * Berkeley DB * Cloudant * Cloud Spanner * Clustrix * CockroachDB * Couchbase * CouchDB * CUBRID * IBM Db2 – since IBM D ...
* List of relational database management systems * CockroachDB * TiDB


References


External links

* * {{GitHub, yugabyte/yugabyte-db
Slack community
Database companies Databases Database-related software for Linux NewSQL Distributed computing Computer systems Software companies of the United States Companies based in Silicon Valley