YugabyteDB
   HOME

TheInfoList



OR:

YugabyteDB is a high-performance transactional
distributed SQL A distributed SQL database is a single relational database which replicates data across multiple servers. Distributed SQL databases are strongly consistent and most support consistency across racks, data centers, and wide area networks including c ...
database In computing, a database is an organized collection of data stored and accessed electronically. Small databases can be stored on a file system, while large databases are hosted on computer clusters or cloud storage. The design of databases sp ...
for cloud-native applications, developed by Yugabyte.


History

Yugabyte was founded by ex-
Facebook Facebook is an online social media and social networking service owned by American company Meta Platforms. Founded in 2004 by Mark Zuckerberg with fellow Harvard College students and roommates Eduardo Saverin, Andrew McCollum, Dustin M ...
engineers Kannan Muthukkaruppan, Karthik Ranganathan, and Mikhail Bautin. At Facebook, they were part of the team that built and operated
Cassandra Cassandra or Kassandra (; Ancient Greek: Κασσάνδρα, , also , and sometimes referred to as Alexandra) in Greek mythology was a Trojan priestess dedicated to the god Apollo and fated by him to utter true prophecies but never to be believe ...
and
HBase HBase is an open-source non-relational distributed database modeled after Google's Bigtable and written in Java. It is developed as part of Apache Software Foundation's Apache Hadoop project and runs on top of HDFS (Hadoop Distributed File Sys ...
during a period of significant growth in workloads such as Facebook
Messenger ''MESSENGER'' was a NASA robotic space probe that orbited the planet Mercury between 2011 and 2015, studying Mercury's chemical composition, geology, and magnetic field. The name is a backronym for "Mercury Surface, Space Environment, Geoche ...
and Facebook's
Operational Data Store An operational data store (ODS) is used for operational reporting and as a source of data for the enterprise data warehouse (EDW). It is a complementary element to an EDW in a decision support environment, and is used for operational reporting, con ...
. The founders came together in February 2016 to build YugabyteDB, believing that the trends they experienced at Facebook –
microservices A microservice architecture – a variant of the service-oriented architecture structural style – is an architectural pattern that arranges an application as a collection of loosely-coupled, fine-grained services, communicating through lightw ...
,
containerization Containerization is a system of intermodal freight transport using intermodal containers (also called shipping containers and ISO containers). Containerization is also referred as "Container Stuffing" or "Container Loading", which is the pro ...
,
high availability High availability (HA) is a characteristic of a system which aims to ensure an agreed level of operational performance, usually uptime, for a higher than normal period. Modernization has resulted in an increased reliance on these systems. Fo ...
, geographic distribution,
API An application programming interface (API) is a way for two or more computer programs to communicate with each other. It is a type of software Interface (computing), interface, offering a service to other pieces of software. A document or standa ...
s, and
open-source Open source is source code that is made freely available for possible modification and redistribution. Products include permission to use the source code, design documents, or content of the product. The open-source model is a decentralized sof ...
– were relevant to all businesses, especially as they move from on-premise to cloud-native operations. YugabyteDB was initially available in two editions: community and enterprise. In July 2019, Yugabyte open sourced previously commercial features and launched YugabyteDB as open-source under the
Apache The Apache () are a group of culturally related Native American tribes in the Southwestern United States, which include the Chiricahua, Jicarilla, Lipan, Mescalero, Mimbreño, Ndendahe (Bedonkohe or Mogollon and Nednhi or Carrizaleño an ...
2.0 license. The rapid evolution of the product led to being named as a 2020
Gartner Gartner, Inc is a technological research and consulting firm based in Stamford, Connecticut that conducts research on technology and shares this research both through private consulting as well as executive programs and conferences. Its clients ...
Cool Vendor in Data Management. Yugabyte launched Yugabyte Cloud, now renamed YugabyteDB Managed, a fully managed database-as-a-service offering of YugabyteDB, in September 2021.


Funding

Six years after the company's inception, Yugabyte closed a $188 Million Series C funding round to become a
Unicorn The unicorn is a legendary creature that has been described since antiquity as a beast with a single large, pointed, spiraling horn projecting from its forehead. In European literature and art, the unicorn has for the last thousand years o ...
start-up with a valuation of $1.3Bn


Architecture

YugabyteDB is a distributed SQL database that aims to be strongly transactionally consistent across failure zones (i.e.
ACID In computer science, ACID ( atomicity, consistency, isolation, durability) is a set of properties of database transactions intended to guarantee data validity despite errors, power failures, and other mishaps. In the context of databases, a sequ ...
compliance]. Jepsen testing, the ''de facto'' industry standard for verifying correctness, has never fully passed, mainly due to race conditions during schema changes. In
CAP Theorem In theoretical computer science, the CAP theorem, also named Brewer's theorem after computer scientist Eric Brewer, states that any distributed data store can provide only two of the following three guarantees:Seth Gilbert and Nancy Lynch"Brewer' ...
terms YugabyteDB is a Consistent/Partition Tolerant (CP) database. YugabyteDB has two layers, a storage engine known as DocDB and the Yugabyte Query Layer.


DocDB

The storage engine consists of a customized
RocksDB RocksDB is a high performance embedded database for Key-value database, key-value data. It is a fork of Google's LevelDB optimized to exploit many multi-core processor, CPU cores, and make efficient use of fast storage, such as solid-state drives ...
combined with sharding and load balancing algorithms for the data. In addition, the
Raft consensus algorithm Raft is a consensus algorithm designed as an alternative to the Paxos family of algorithms. It was meant to be more understandable than Paxos by means of separation of logic, but it is also formally proven safe and offers some additional featu ...
controls the replication of data between the nodes. There is also a
Distributed transaction A distributed transaction is a database transaction in which two or more network hosts are involved. Usually, hosts provide transactional resources, while the transaction manager is responsible for creating and managing a global transaction that enc ...
manager and
Multiversion concurrency control Multiversion concurrency control (MCC or MVCC), is a concurrency control method commonly used by database management systems to provide concurrent access to the database and in programming languages to implement transactional memory. Description W ...
(MVCC) to support distributed transactions. The engine also exploits a Hybrid Logical Clock that combines coarsely-synchronized physical clocks with Lamport clocks to track causal relationships. The DocDB layer is not directly accessible by users.


YugabyteDB Query Layer

Yugabyte has a pluggable query layer that abstracts the query layer from the storage layer below. There are currently two APIs that can access the database: YSQL is a PostgreSQL code-compatible API based around v11.2. YSQL is accessed via standard PostgreSQL drivers using native protocols. It exploits the native PostgreSQL code for the query layer and replaces the storage engine with calls to the pluggable query layer. This re-use means that Yugabyte supports many features, including: * Triggers & Stored Procedures * PostgreSQL extensions that operate in the query layer * Native JSONB support YCQL is a Cassandra-like API based around v3.10 and re-written in C++. YCQL is accessed via standard Cassandra drivers using the native protocol port of 9042. In addition to the 'vanilla' Cassandra components, YCQL is augmented with the following features: * Transactional consistency - unlike Cassandra, Yugabyte YCQL is transactional. *
JSON JSON (JavaScript Object Notation, pronounced ; also ) is an open standard file format and data interchange format that uses human-readable text to store and transmit data objects consisting of attribute–value pairs and arrays (or other ser ...
data types supported natively * Tables can have secondary indexes Currently, data written to either API is not accessible via the other API, however YSQL can access YCQL using the PostgreSQL foreign data wrapper feature. The security model for accessing the system is inherited from the API, so access controls for YSQL look like PostgreSQL, and YCQL looks like Cassandra access controls.


Cluster-to-cluster replication

In addition to its core functionality of distributing a single database, YugabyteDB has the ability to replicate between database instances. The replication can be one-way or bi-directional and is asynchronous. One-way replication is used either to create a read-only copy for workload off-loading or in a read-write mode to create an active-passive standby. Bi-directional replication is generally used in read-write configurations and is used for active-active configurations, geo-distributed applications, etc.


See also

*
Cloud database A cloud database is a database that typically runs on a cloud computing platform and access to the database is provided as-a-service. There are two common deployment models: users can run databases on the cloud independently, using a virtual machin ...
*
Distributed SQL A distributed SQL database is a single relational database which replicates data across multiple servers. Distributed SQL databases are strongly consistent and most support consistency across racks, data centers, and wide area networks including c ...
*
Comparison of relational database management systems The following tables compare general and technical information for a number of relational database management systems. Please see the individual products' articles for further information. Unless otherwise specified in footnotes, comparisons are b ...
* Comparison of object–relational database management systems *
Cloud native computing Cloud native computing is an approach in software development that utilizes cloud computing to "build and run scalable applications in modern, dynamic environments such as public, private, and hybrid clouds". These technologies such as containers, m ...
*
Database management system In computing, a database is an organized collection of data stored and accessed electronically. Small databases can be stored on a file system, while large databases are hosted on computer clusters or cloud storage. The design of databases span ...
*
List of databases using MVCC The following database management systems and other software use multiversion concurrency control. Databases * Altibase * Berkeley DB * Cloudant * Cloud Spanner * Clustrix * CockroachDB * Couchbase * CouchDB * CUBRID * IBM Db2 – since IBM D ...
*
List of relational database management systems This is a list of relational database management systems. List of software * 4th Dimension *Access Database Engine (formerly known as Jet Database Engine) *Adabas D *Airtable *Apache Derby *Apache Ignite * Aster Data *Amazon Aurora *Altibase * CA ...
*
CockroachDB CockroachDB is a commercial distributed SQL database management system, developed by Cockroach Labs. History Cockroach Labs was founded in 2015 by ex-Google employees Spencer Kimball, Peter Mattis, and Ben Darnell. Cockroach Labs founders Kim ...
*
TiDB TiDB (/’taɪdiːbi:/, "Ti" stands for Titanium) is an open-source NewSQL database that supports Hybrid Transactional and Analytical Processing ( HTAP) workloads. It is MySQL compatible and can provide horizontal scalability, strong consistenc ...


References


External links

* * {{GitHub, yugabyte/yugabyte-db
Slack community
Database companies Databases Database-related software for Linux NewSQL Distributed computing
Computer systems A computer is a machine that can be programmed to Execution (computing), carry out sequences of arithmetic or logical operations (computation) automatically. Modern digital electronic computers can perform generic sets of operations known as C ...
Software companies of the United States Companies based in Silicon Valley