YDB (database)

YDB (Yet another DataBase) is a distributed SQL database management system (DBMS) developed by Yandex and available as open-source technology.


Functionality

YDB is a technology for building large web services capable of sustaining heavy operational loads of up to millions of requests per second. It uses a strongly typed dialect of SQL, YDB Query Language (YQL), as its default query language and supports ACID transactions. The closest analogues of this DBMS available as open-source software are YugabyteDB and CockroachDB. YDB can be self-deployed to computer clusters on physical hosts or on virtual machines via Kubernetes, or used as a managed service in Yandex Cloud. The managed service is available in serverless computing mode or dedicated mode.


Architecture

YDB runs on clusters with a shared-nothing architecture built from standard commodity hardware. The system is based on tablets, which implement a communication protocol for solving consensus in a network of unreliable processors. Functionally, this protocol is similar to Paxos and Raft.

User tables in YDB have a mandatory primary key and are sharded by ranges of that key. Shards with user data are controlled by tablets called DataShards. A DataShard can grow to several gigabytes and is automatically split into multiple tablets when a data storage threshold or a shard load threshold is exceeded. This is how the system scales transparently with the user load. In addition to DataShard, other tablet types include, among others:

* SchemeShard, which stores metadata about user tables;
* Hive, which balances and launches tablets;
* Coordinator and Mediator, which schedule distributed transactions.

Tablet data is stored in the Distributed Storage layer, a key-value store with a specialized protocol that supports the tablet protocol. Distributed Storage ensures data replication and stores tablet data as BLOBs.

YDB executes distributed transactions over data from one or more tables using a distributed transaction framework based on the Calvin algorithm. Unlike Calvin, YDB supports interactive and non-deterministic transactions by using record locking.

YDB is based on the actor model. Actors are single-threaded back-end automata that exchange messages with each other, including across different cluster servers. Messages within the network are exchanged using the interconnect library developed as part of the project. A number of digital services, such as virtual block devices and persistent queues, have been developed as a layer over YDB.

YDB supports user interaction via the gRPC protocol, with several client SDKs implementing procedures for node discovery, client balancing, and so on. YDB does not support UUID as a standalone data type, and it has no built-in function to automatically increment a field value when data is added to a table.
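
The range-based sharding by primary key and the automatic splitting of shards described in this section can be illustrated with a minimal Python sketch. It is not YDB code: the names `Shard`, `RangeShardedTable` and `max_rows_per_shard` are hypothetical, integer keys are assumed, and a simple row-count limit stands in for the data storage and load thresholds that trigger a split in YDB.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class Shard:
    """Holds rows whose keys are >= low and < the next shard's low."""
    low: float                                # inclusive lower bound of the key range
    rows: dict = field(default_factory=dict)  # primary key -> value


class RangeShardedTable:
    """Toy table sharded by primary-key ranges (illustrative only)."""

    def __init__(self, max_rows_per_shard: int = 4):
        # Hypothetical threshold: a row count stands in for size/load limits.
        self.max_rows = max_rows_per_shard
        # Start with one shard that covers the whole key space.
        self.shards = [Shard(low=float("-inf"))]

    def _shard_index(self, key: int) -> int:
        # Pick the last shard whose lower bound is <= key.
        lows = [s.low for s in self.shards]
        return bisect_right(lows, key) - 1

    def upsert(self, key: int, value: str) -> None:
        idx = self._shard_index(key)
        shard = self.shards[idx]
        shard.rows[key] = value
        if len(shard.rows) > self.max_rows:
            self._split(idx)

    def _split(self, idx: int) -> None:
        # Split at the median key: the lower half stays, the upper half moves.
        shard = self.shards[idx]
        keys = sorted(shard.rows)
        mid = keys[len(keys) // 2]
        upper = {k: v for k, v in shard.rows.items() if k >= mid}
        shard.rows = {k: v for k, v in shard.rows.items() if k < mid}
        self.shards.insert(idx + 1, Shard(low=mid, rows=upper))

    def get(self, key: int):
        return self.shards[self._shard_index(key)].rows.get(key)


if __name__ == "__main__":
    table = RangeShardedTable(max_rows_per_shard=4)
    for k in range(10):
        table.upsert(k, f"row-{k}")
    print([s.low for s in table.shards])  # lower bounds of the key ranges
    print(table.get(7))
```

Because every shard owns a contiguous key range, a lookup only needs a binary search over the range boundaries, and a split touches a single shard while the rest of the table is left alone.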


History

In 2010, Yandex started working on its own NoSQL DBMS, KiWi, and rolled it out for internal use in 2011. However, KiWi offered only eventual consistency and had other disadvantages of the NoSQL model. In 2012, to cover its DBMS needs, Yandex started the KiKiMR project, which later became known as YDB. In 2016, YDB was rolled out to Yandex services. In 2018, the Yandex Cloud platform was launched with data storage based on YDB. At the same time, the company announced that it would make YDB available as a managed service in Yandex Cloud, and it later gave customers access to this service alongside other managed services, such as PostgreSQL and MongoDB. This cloud version was called Yandex Database (later, Managed Service for YDB). In April 2022, the YDB DBMS was published on GitHub as free software under the Apache 2.0 License.


External links

* Official website: https://ydb.tech