Bcache

bcache (abbreviated from ''block cache'') is a cache in the Linux kernel's block layer, which is used for accessing secondary storage devices. It allows one or more fast storage devices, such as flash-based solid-state drives (SSDs), to act as a cache for one or more slower storage devices, such as hard disk drives (HDDs); this effectively creates hybrid volumes and provides performance improvements. Designed around the nature and performance characteristics of SSDs, bcache also minimizes write amplification by avoiding random writes and turning them into sequential writes instead. This merging of I/O operations is performed for both the cache and the primary storage, helping to extend the lifetime of flash-based devices used as caches and to improve the performance of write-sensitive primary storage, such as RAID 5 sets. bcache is licensed under the GNU General Public License (GPL), and Kent Overstreet is its primary developer. Overstreet considers bcache a "prototype" for the development of bcachefs, a filesystem with significant improvements.


Overview

Using bcache makes it possible to have SSDs as another level of indirection within the data storage access paths, resulting in improved overall performance by using fast flash-based SSDs as caches for slower mechanical hard disk drives (HDDs) with rotational magnetic media. That way, the gap between SSDs and HDDs can be bridged: the costly speed of SSDs gets combined with the cheap storage capacity of traditional HDDs.

Caching works by storing on the SSDs the data associated with random reads and random writes, exploiting the near-zero seek times that are the most prominent feature of SSDs. Sequential I/O is not cached, to avoid rapid SSD cache invalidation by operations that HDDs already handle well; going around the cache for big sequential writes is known as the write-around policy. Not caching sequential I/O also helps to extend the lifetime of SSDs used as caches.

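
The bypass of sequential I/O described above can be illustrated with a small model. The sketch below is an assumption-laden toy, not bcache's actual heuristic: the class name `SequentialDetector`, the `cutoff_bytes` parameter and the single-stream tracking are all hypothetical. It treats a request as sequential when it starts exactly where the previous one ended, and bypasses the cache once the contiguous run reaches the cutoff.

```python
class SequentialDetector:
    """Toy model of sequential-I/O bypass; not bcache's actual heuristic."""

    def __init__(self, cutoff_bytes=4 * 1024 * 1024):
        self.cutoff_bytes = cutoff_bytes  # hypothetical threshold for bypassing
        self.next_offset = None           # where a contiguous run would continue
        self.run_bytes = 0                # bytes seen in the current run

    def should_bypass(self, offset, length):
        """Return True if this request should skip the cache."""
        if offset == self.next_offset:
            self.run_bytes += length      # request continues the current run
        else:
            self.run_bytes = length       # a seek starts a new run
        self.next_offset = offset + length
        return self.run_bytes >= self.cutoff_bytes


# Six contiguous 256 KiB requests against a 1 MiB cutoff, then one random request.
detector = SequentialDetector(cutoff_bytes=1024 * 1024)
decisions = [detector.should_bypass(i * 262144, 262144) for i in range(6)]
random_decision = detector.should_bypass(10_000_000, 4096)
```

In this model the first three requests are cached, the run is bypassed from the fourth request onward, and the later random request is cached again because the seek resets the run.
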
Write amplification is avoided by not performing random writes to SSDs; instead, all random writes to SSD caches are always combined into block-level writes, so that only complete erase blocks are rewritten on the SSDs.

Both ''write-back'' and ''write-through'' (which is the default) policies are supported for caching write operations. With the write-back policy, written data is stored inside the SSD caches first and propagated to the HDDs later in a batched way, using seek-friendly operations, which makes bcache act as an I/O scheduler as well. The write-through policy ensures that no write operation is marked as finished until the data requested to be written has reached both SSDs and HDDs; its performance improvements are smaller, because the written data is effectively only cached.

Write-back policy with batched writes to HDDs provides additional benefits to write-sensitive redundant array of independent disks (RAID) layouts such as RAID 5 and RAID 6, which perform actual write operations as atomic read-modify-write sequences. That way, performance penalties of small random writes are reduced or avoided for such RAID layouts, by grouping them together and performing them as batched sequential writes.

Caching performed by bcache operates at the block device level, making it file system–agnostic as long as the file system provides an embedded universally unique identifier (UUID); this requirement is satisfied by virtually all standard Linux file systems, as well as by swap partitions. Sizes of the logical blocks used internally by bcache as caching extents can go down to the size of a single HDD sector.
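
The difference between the two write policies can be sketched with a toy model. Everything here is illustrative and assumed: the `ToyCache` class, its `backing_log` list and the explicit `flush()` call are hypothetical and say nothing about how bcache is implemented internally. The point is only that write-through sends each write to the slow device in arrival order, while write-back collects dirty blocks and later writes them out in ascending, seek-friendly order.

```python
class ToyCache:
    """Toy block cache contrasting write-through and write-back (illustrative only)."""

    def __init__(self, mode="writethrough"):
        if mode not in ("writethrough", "writeback"):
            raise ValueError(f"unsupported mode: {mode}")
        self.mode = mode
        self.cache = {}        # block number -> data held on the fast device
        self.dirty = set()     # blocks not yet written to the backing device
        self.backing_log = []  # order in which blocks reach the slow device

    def write(self, block, data):
        self.cache[block] = data
        if self.mode == "writethrough":
            # The write completes only once the backing device has the data.
            self.backing_log.append(block)
        else:
            # Write-back: acknowledge immediately and remember the block as dirty.
            self.dirty.add(block)

    def flush(self):
        """Write dirty blocks to the backing device in ascending block order."""
        for block in sorted(self.dirty):
            self.backing_log.append(block)
        self.dirty.clear()


wt = ToyCache("writethrough")
wb = ToyCache("writeback")
for block in (90, 3, 57, 4, 91):
    wt.write(block, b"x")
    wb.write(block, b"x")
wb.flush()
```

After this run, `wt.backing_log` holds the blocks in arrival order, while `wb.backing_log` holds them sorted, which is the batching effect the write-back description refers to.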


History

bcache was first announced by Kent Overstreet in July 2010, as a completely working Linux kernel module, though at its early beta stage. The development continued for almost two years, until May 2012, at which point bcache reached its production-ready state. It was merged into the Linux kernel mainline in kernel version 3.10, released on June 30, 2013. Overstreet has since been developing the file system bcachefs, based on ideas first developed in bcache that he said began "evolving ... into a full blown, general-purpose POSIX filesystem". He describes bcache as a "prototype" for the ideas that became bcachefs and intends bcachefs to replace bcache. He officially announced bcachefs in 2015, and as of 2018 has been submitting it for consideration for inclusion in the mainline Linux kernel.


Features

As of version 3.10 of the Linux kernel, the following features are provided by bcache:
* The same cache device can be used for caching an arbitrary number of primary storage devices
* Runtime attaching and detaching of primary storage devices from their caches, while mounted and in use (running in passthrough mode when not cached)
* Automated recovery from unclean shutdowns: writes are not completed until the cache is consistent with respect to the primary storage device; internally, bcache makes no distinction between clean and unclean shutdowns
* Transparent handling of I/O errors generated by the cache devices
* Write barriers and associated cache flushes are properly handled
* Write-through (which is the default), write-back and write-around policies
* Sequential I/O is detected and bypassed, with configurable thresholds; bypassing can also be disabled
* Throttling of the I/O to the SSD if it becomes congested, as detected by measured latency of the SSD's I/O operations exceeding a configurable threshold; useful for configurations having one SSD providing caching for many HDDs
* Readahead on a cache miss (disabled by default)
* Highly efficient write-back implementation: dirty data is always written out in sorted order, and optionally background write-back is smoothly throttled down to keep a configured percentage of the cache dirty
* High-performance B+ trees are used internally: bcache is capable of around 1,000,000 IOPS on random reads, if the hardware is fast enough
* Various runtime statistics and configuration options are exposed through sysfs
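
As a rough illustration of the sysfs interface mentioned in the list, the helpers below read and write per-device tunables. This is a sketch under assumptions: the helper names are hypothetical, and the path layout `/sys/block/<device>/bcache/<attribute>` and the attribute names used in the usage note (`cache_mode`, `sequential_cutoff`, `writeback_percent`) follow the kernel's bcache documentation as best recalled and should be checked against the running kernel. The `sysfs_root` parameter exists only so the code can be tried against a scratch directory.

```python
from pathlib import Path


def bcache_attr_path(device, name, sysfs_root="/sys"):
    """Build the sysfs path of a per-device bcache attribute (layout assumed)."""
    return Path(sysfs_root) / "block" / device / "bcache" / name


def set_bcache_option(device, name, value, sysfs_root="/sys"):
    """Write a tunable; on a real system this typically requires root."""
    path = bcache_attr_path(device, name, sysfs_root)
    path.write_text(f"{value}\n")
    return path


def get_bcache_option(device, name, sysfs_root="/sys"):
    """Read a tunable back as a stripped string."""
    return bcache_attr_path(device, name, sysfs_root).read_text().strip()
```

On a system with a device named `bcache0`, a call such as `set_bcache_option("bcache0", "cache_mode", "writeback")` would select the write-back policy, and `set_bcache_option("bcache0", "sequential_cutoff", 0)` would disable the sequential bypass. Reading some attributes on a real kernel may return more than the bare value, so treat `get_bcache_option` as a starting point only.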


Improvements

The following new features are planned for future releases of bcache:
* Awareness of data striping in RAID 5 and RAID 6 layouts: adding awareness of the stripe layout to the write-back policy, so that caching decisions give preference to already "dirty" stripes, and background flushes write out complete stripes first
* Handling cache misses with already full B+ tree nodes: as of the bcache version in Linux kernel 3.10, splits of the internally used B+ tree nodes happen on writes, making initial cache warm-ups hardly achievable
* Multiple SSDs in a cache set: only dirty data (for the write-back policy) and metadata would be mirrored, without wasting SSD space for the clean data and read caches
* Data checksumming
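
The stripe-aware flushing idea in the first item can be sketched as an ordering problem. The function below is purely hypothetical: it describes a planned feature, so the name `flush_order`, the `blocks_per_stripe` parameter and the exact ordering rule are assumptions made for illustration. It groups dirty blocks by stripe and emits completely dirty stripes first, since a full-stripe write on RAID 5 can compute parity from the new data alone.

```python
from collections import defaultdict


def flush_order(dirty_blocks, blocks_per_stripe):
    """Order dirty blocks so that completely dirty stripes are flushed first."""
    stripes = defaultdict(list)
    for block in sorted(dirty_blocks):
        stripes[block // blocks_per_stripe].append(block)
    # Full stripes sort before partial ones; ties are broken by stripe number.
    ordered = sorted(
        stripes.items(),
        key=lambda item: (len(item[1]) != blocks_per_stripe, item[0]),
    )
    return [block for _, blocks in ordered for block in blocks]


# Stripes 0 and 2 are fully dirty; stripes 1 and 3 are only partially dirty.
order = flush_order({0, 1, 2, 3, 5, 8, 9, 10, 11, 14}, blocks_per_stripe=4)
```

Here the blocks of stripes 0 and 2 come out first, followed by the partial stripes 1 and 3.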


See also

* dm-cache: a Linux kernel device mapper target that allows creation of hybrid volumes
* EnhanceIO: a disk cache module for the Linux kernel
* Flashcache: a disk cache component for the Linux kernel, initially developed by Facebook
* Hybrid drive: a storage device that combines flash-based and spinning magnetic media storage technologies
* ReadyBoost: a disk caching software component of Windows Vista and later Microsoft operating systems
* Smart Response Technology (SRT): a proprietary disk storage caching mechanism, developed by Intel for its chipsets


External links

* LSFMM: Caching dm-cache and bcache, LWN.net, May 1, 2013, by Jake Edge
* Linux Block Caching Choices in Stable Upstream Kernel (PDF), Dell, December 2013
* Testing bcache series, ''Linux Magazine'', August–September 2010, by Jeffrey B. Layton
* Performance Comparison among EnhanceIO, bcache and dm-cache, LKML, June 11, 2013
* EnhanceIO, Bcache & DM-Cache Benchmarked, Phoronix, June 11, 2013, by Michael Larabel