Database Engine

A database engine (or storage engine) is the underlying software component that a database management system (DBMS) uses to create, read, update and delete (CRUD) data from a database. Most database management systems include their own application programming interface (API) that allows the user to interact with their underlying engine without going through the user interface of the DBMS. The term "database engine" is frequently used interchangeably with "database server" or "database management system". A "database instance" refers to the processes and memory structures of the running database engine.

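As an illustration of reaching an engine through an API rather than through a DBMS user interface, the following minimal sketch runs the four CRUD operations using Python's standard `sqlite3` module. SQLite is chosen here only for convenience and is not mentioned in the text above; the table name `item` and its columns are hypothetical.

```python
import sqlite3

def crud_demo():
    # ":memory:" keeps the whole database in RAM, so the sketch leaves no files behind.
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT, qty INTEGER)")
        # Create: the first inserted row receives id 1.
        conn.execute("INSERT INTO item (name, qty) VALUES (?, ?)", ("widget", 10))
        # Read
        created = conn.execute("SELECT name, qty FROM item WHERE id = 1").fetchone()
        # Update
        conn.execute("UPDATE item SET qty = ? WHERE id = ?", (7, 1))
        updated = conn.execute("SELECT qty FROM item WHERE id = 1").fetchone()[0]
        # Delete
        conn.execute("DELETE FROM item WHERE id = ?", (1,))
        remaining = conn.execute("SELECT COUNT(*) FROM item").fetchone()[0]
        conn.commit()
        return created, updated, remaining
    finally:
        conn.close()
```

Calling `crud_demo()` returns the row as first read, the quantity after the update, and the row count after the delete.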

Storage engines

Many modern DBMSs support multiple storage engines within the same database. For example, MySQL supports InnoDB as well as MyISAM. Some storage engines are transactional. Additional engine types include:
* Embedded database engines
* In-memory database engines
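The difference between an in-memory database and a file-backed embedded one can be sketched with Python's standard `sqlite3` module, used here as a stand-in and not as one of the engines named above. The helper names `count_after_reopen` and `compare_engines` and the file name `demo.db` are hypothetical.

```python
import os
import sqlite3
import tempfile

def count_after_reopen(path):
    """Write one row, close the connection, reopen, and count surviving rows."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS t (x INTEGER)")
    conn.execute("INSERT INTO t VALUES (1)")
    conn.commit()
    conn.close()

    # A new connection to ":memory:" gets a fresh, empty database;
    # a new connection to a file path sees what was committed earlier.
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS t (x INTEGER)")
    n = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    conn.close()
    return n

def compare_engines():
    in_memory = count_after_reopen(":memory:")
    with tempfile.TemporaryDirectory() as d:
        on_disk = count_after_reopen(os.path.join(d, "demo.db"))
    return in_memory, on_disk
```

In this sketch the in-memory database loses its row once the connection closes, while the file-backed database keeps it.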


Design considerations

Information in a database is stored as bits laid out as data structures in storage that can be efficiently read from and written to given the properties of the hardware. Typically the storage itself is designed to meet the requirements of various areas that extensively utilize storage, including databases. A DBMS in operation always simultaneously utilizes several storage types (e.g., memory and external storage), each with its own layout methods. In principle the database storage can be viewed as a linear address space, where every bit of data has its unique address in this address space. In practice, only a very small percentage of addresses are kept as initial reference points (which also requires storage); most data is accessed by indirection using displacement calculations (distance in bits from the reference points) and data structures which define access paths (using pointers) to all needed data in an effective manner, optimized for the needed data access operations.
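A displacement calculation of this kind can be sketched with fixed-size records in a flat byte buffer: only the base address is stored, and each record's location is computed from it. The record layout (a 4-byte id plus a 12-byte name) and the function names are assumptions made for illustration, and the sketch counts displacement in bytes rather than bits.

```python
import struct

# Hypothetical fixed-size record: little-endian 4-byte id + 12-byte name (16 bytes).
RECORD = struct.Struct("<I12s")

def write_record(buf, base, index, rec_id, name):
    # Displacement from the reference point: index * record size.
    offset = base + index * RECORD.size
    RECORD.pack_into(buf, offset, rec_id, name.encode("ascii"))

def read_record(buf, base, index):
    offset = base + index * RECORD.size
    rec_id, raw = RECORD.unpack_from(buf, offset)
    # Short names are padded with NUL bytes when packed; strip them on read.
    return rec_id, raw.rstrip(b"\x00").decode("ascii")
```

For example, with a 64-byte header as the reference point, `write_record(buf, 64, 2, 42, "bolt")` places the record at byte 64 + 2 × 16 = 96 of a sufficiently large `bytearray`.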


Database storage hierarchy

A database, while in operation, resides simultaneously in several types of storage, forming a storage hierarchy. By the nature of contemporary computers, most of the part of the database that sits inside the computer hosting the DBMS resides (partially replicated) in volatile storage. Data (pieces of the database) that are being processed or manipulated reside inside a processor, possibly in the processor's caches. These data are read from and written to memory, typically through a computer bus (up to this point, the components are typically volatile storage). Computer memory exchanges data with external storage, typically through standard storage interfaces or networks (e.g., Fibre Channel, iSCSI). A storage array, a common external storage unit, typically has a storage hierarchy of its own: a fast cache, typically consisting of (volatile and fast) DRAM, is connected (again via standard interfaces) to drives, possibly with different speeds, such as flash drives and magnetic disk drives (non-volatile). The drives may be connected to magnetic tapes, on which typically the least active parts of a large database, or generations of database backups, may reside. Currently, storage speed typically correlates with price, and the faster storage is typically volatile.
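One level of such a hierarchy can be sketched as a small, fast cache placed in front of slower storage. The class below is a simplified illustration under stated assumptions: the name `BufferCache`, the least-recently-used eviction policy, and the use of a plain dictionary as the "slow" backing store are all hypothetical choices, not a description of any particular DBMS.

```python
from collections import OrderedDict

class BufferCache:
    """A tiny LRU page cache standing in for the fast, volatile tier."""

    def __init__(self, backing, capacity):
        self.backing = backing      # dict of page_id -> bytes, simulating slow storage
        self.capacity = capacity    # maximum number of pages held in the fast tier
        self.pages = OrderedDict()  # ordered from least to most recently used
        self.hits = 0
        self.misses = 0

    def read(self, page_id):
        if page_id in self.pages:
            self.hits += 1
            self.pages.move_to_end(page_id)  # mark as most recently used
            return self.pages[page_id]
        self.misses += 1
        data = self.backing[page_id]         # the slow path
        self.pages[page_id] = data
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)   # evict the least recently used page
        return data
```

Repeated reads of the same pages are served from the cache, while pages that have not been touched recently fall back to the slower tier.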


Data structures

A data structure is an abstract construct that embeds data in a well-defined manner. An efficient data structure allows the data to be manipulated efficiently. Data manipulation may include insertion, deletion, updating and retrieval in various modes. A given data structure type may be very effective for certain operations and very ineffective for others. A data structure type is selected during DBMS development to best meet the operations needed for the types of data it contains. The choice of data structure for a certain task typically also takes into consideration the type of storage it resides in (e.g., speed of access, minimal size of storage chunk accessed, etc.). In some DBMSs, database administrators have the flexibility to select among data structure options for holding user data, for performance reasons. Sometimes the data structures have selectable parameters for tuning database performance. Databases may store data in many data structure types. Common examples are the following:
* ordered/unordered flat files
* hash tables
* B+ trees
* ISAM
* heaps
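The trade-off between structure types can be sketched by comparing a hash table with an ordered file, here approximated by a Python `dict` and a sorted list. The sample records and the name `range_scan` are hypothetical.

```python
import bisect

# Hypothetical (key, value) records in arrival order.
records = [(17, "bolt"), (3, "nut"), (42, "gear"), (8, "cam")]

# Hash table: direct lookup by exact key, but no useful key ordering.
hash_index = {key: value for key, value in records}

# Ordered file: records kept sorted by key, which makes range scans cheap.
sorted_records = sorted(records)
sorted_keys = [key for key, _ in sorted_records]

def range_scan(low, high):
    """Return all records with low <= key <= high using binary search on the keys."""
    start = bisect.bisect_left(sorted_keys, low)
    end = bisect.bisect_right(sorted_keys, high)
    return sorted_records[start:end]
```

An exact-match lookup such as `hash_index[42]` suits the hash table, whereas a query such as `range_scan(5, 20)` relies on the ordering that the hash table does not provide.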


Data orientation and clustering

In contrast to conventional row-orientation, relational databases can also be column-oriented or correlational in the way they store data in any particular structure. In general, a substantial performance improvement is gained if different types of database objects that are usually utilized together are laid out in storage in proximity, that is, "clustered". This usually allows the needed related objects to be retrieved from storage in a minimum number of input operations (each of which is sometimes substantially time-consuming). Even for in-memory databases, clustering provides a performance advantage due to the common utilization of large caches for input-output operations in memory, with similar resulting behavior. For example, it may be beneficial to cluster a record of an "item" in stock with all its respective "order" records. The decision of whether to cluster certain objects depends on the objects' utilization statistics, object sizes, cache sizes, storage types, etc.
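The difference between row and column orientation can be sketched by storing the same small table both ways. The sample data and the function names `to_columns` and `total_quantity` are hypothetical, and Python lists only approximate how values would sit next to each other in real storage.

```python
# Row-oriented: all fields of one record are stored together.
rows = [
    {"id": 1, "name": "bolt", "qty": 10},
    {"id": 2, "name": "nut", "qty": 25},
    {"id": 3, "name": "gear", "qty": 4},
]

def to_columns(rows):
    """Regroup row-oriented records so that each field's values are stored together."""
    columns = {}
    for row in rows:
        for field, value in row.items():
            columns.setdefault(field, []).append(value)
    return columns

def total_quantity(columns):
    # An aggregate over one field touches only that field's list of values.
    return sum(columns["qty"])
```

Fetching a whole record is a single access in the row layout, while an aggregate over one field reads only one list in the column layout.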


Database indexing

Indexing is a technique some storage engines use for improving database performance. The many types of indexes share the common property that they reduce the need to examine every entry when running a query. In large databases, this can reduce query time/cost by orders of magnitude. The simplest form of index is a sorted list of values that can be searched using a binary search with an adjacent reference to the location of the entry, analogous to the index in the back of a book. The same data can have multiple indexes (an employee database could be indexed by last name and hire date). Indexes affect performance, but not results. Database designers can add or remove indexes without changing application logic, reducing maintenance costs as the database grows and database usage evolves. Indexes can speed up data access, but they consume space in the database, and must be updated each time the data is altered. Indexes therefore can speed data access but slow data maintenance. These two properties determine whether a given index is worth the cost.
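The simplest form of index described above, a sorted list of values with an adjacent reference to each entry's location, can be sketched as follows. The function names `build_index` and `lookup` and the employee fields are hypothetical; a row's position in a Python list stands in for its storage location.

```python
import bisect

def build_index(table, column):
    """Return a sorted list of (value, row_position) pairs for one column."""
    return sorted((row[column], pos) for pos, row in enumerate(table))

def lookup(index, table, value):
    """Find all rows whose indexed column equals value, without scanning the table."""
    # Tuples compare element-wise, so (value,) sorts before every (value, pos).
    i = bisect.bisect_left(index, (value,))
    matches = []
    while i < len(index) and index[i][0] == value:
        matches.append(table[index[i][1]])  # follow the reference to the row
        i += 1
    return matches

employees = [
    {"last_name": "Lee", "hire_date": "2019-04-01"},
    {"last_name": "Ada", "hire_date": "2021-09-15"},
    {"last_name": "Lee", "hire_date": "2020-01-10"},
]

# The same data can carry several indexes; each must be rebuilt or updated on writes.
by_last_name = build_index(employees, "last_name")
by_hire_date = build_index(employees, "hire_date")
```

A call such as `lookup(by_last_name, employees, "Lee")` uses the binary search to find the first matching entry and then follows the stored positions back to the rows. The table itself is unchanged by adding or dropping either index.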


External links

* https://web.archive.org/web/20100330045149/http://dev.mysql.com/tech-resources/articles/storage-engine/part_3.html
* MySQL Administrator's Bible, Chapter 11 "Storage Engines"