Gluster Inc. (formerly known as Z RESEARCH) was a software company that provided an open source platform for scale-out public and private cloud storage. The company was privately funded and headquartered in Sunnyvale, California, with an engineering center in Bangalore, India. Gluster was funded by Nexus Venture Partners and Index Ventures. Gluster was acquired by Red Hat on October 7, 2011.


History

The name ''Gluster'' comes from the combination of the terms ''GNU'' and ''cluster''. Despite the similarity in names, Gluster is not related to the Lustre file system and does not incorporate any Lustre code. Gluster based its product on ''GlusterFS'', an open-source, software-based network-attached filesystem that deploys on commodity hardware. The initial version of GlusterFS was written by Anand Babu Periasamy, Gluster's founder and CTO. In May 2010, Ben Golub became the president and chief executive officer.

Red Hat became the primary author and maintainer of the GlusterFS open-source project after acquiring the Gluster company in October 2011. The product was first marketed as Red Hat Storage Server, but in early 2015 it was renamed Red Hat Gluster Storage, since Red Hat had also acquired the Ceph file system technology. Red Hat Gluster Storage is in the retirement phase of its lifecycle, with an end-of-support-life date of December 31, 2024.


Architecture

The GlusterFS architecture aggregates compute, storage, and I/O resources into a global namespace. Each server plus attached commodity storage (configured as direct-attached storage, JBOD, or using a storage area network) is considered to be a node. Capacity is scaled by adding nodes or by adding storage to each node. Performance is increased by deploying storage among more nodes. High availability is achieved by replicating data n-way between nodes.
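
The trade-off between capacity and n-way replication can be illustrated with a little arithmetic. The following is a minimal sketch, not GlusterFS code: the function name `usable_capacity_tb` and its parameters are hypothetical, and the model assumes that every node contributes the same raw capacity and that each file is stored in full on `replicas` distinct nodes, ignoring file system overhead.

```python
def usable_capacity_tb(nodes: int, raw_tb_per_node: float, replicas: int) -> float:
    """Estimate usable capacity (in TB) under n-way replication.

    Assumption for this sketch: every node has the same raw capacity and
    each file is stored whole on `replicas` different nodes.
    """
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    if nodes < replicas:
        raise ValueError("need at least as many nodes as replicas")
    # Total raw capacity is divided by the number of copies kept of each file.
    return nodes * raw_tb_per_node / replicas


if __name__ == "__main__":
    # Six nodes of 10 TB each with 3-way replication.
    print(usable_capacity_tb(nodes=6, raw_tb_per_node=10.0, replicas=3))
    # Adding two more nodes grows capacity without changing the replica count.
    print(usable_capacity_tb(nodes=8, raw_tb_per_node=10.0, replicas=3))
```

Under these assumptions, adding nodes raises usable capacity linearly, while raising the replica count trades capacity for availability.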


Public cloud deployment

For public cloud deployments, GlusterFS offers an Amazon Web Services (AWS) Amazon Machine Image (AMI), which is deployed on Elastic Compute Cloud (EC2) instances rather than physical servers; the underlying storage is Amazon's Elastic Block Store (EBS). In this environment, capacity is scaled by deploying more EBS storage units, performance is scaled by deploying more EC2 instances, and availability is scaled by n-way replication between AWS availability zones.


Private cloud deployment

A typical on-premises, or private cloud, deployment consists of GlusterFS installed as a virtual appliance on top of multiple commodity servers running hypervisors such as KVM, Xen, or VMware, or installed directly on bare metal.


GlusterFS

GlusterFS is a scale-out network-attached storage file system. It has found applications including cloud computing, streaming media services, and content delivery networks. GlusterFS was developed originally by Gluster, Inc. and then by Red Hat, Inc., as a result of Red Hat acquiring Gluster in 2011. In June 2012, Red Hat Storage Server was announced as a commercially supported integration of GlusterFS with Red Hat Enterprise Linux. In April 2014, Red Hat bought Inktank Storage, the company behind the Ceph distributed file system, and re-branded the GlusterFS-based Red Hat Storage Server as "Red Hat Gluster Storage".


Design

GlusterFS aggregates various storage servers over Ethernet or InfiniBand RDMA interconnect into one large parallel network file system. It is free software, with some parts licensed under the GNU General Public License (GPL) v3 while others are dual licensed under either GPL v2 or the Lesser General Public License (LGPL) v3. GlusterFS is based on a stackable user space design.

GlusterFS has a client and a server component. Servers are typically deployed as ''storage bricks'', with each server running a daemon to export a local file system as a ''volume''. The client process, which connects to servers with a custom protocol over TCP/IP, InfiniBand or Sockets Direct Protocol, creates composite virtual volumes from multiple remote servers using stackable ''translators''. By default, files are stored whole, but striping of files across multiple remote volumes is also possible. The client may mount the composite volume using the GlusterFS native protocol via the FUSE mechanism, mount it over the NFS v3 protocol using a built-in server translator, or access the volume via the client library. The client may also re-export a native-protocol mount, for example via the kernel NFSv4 server, Samba, or the object-based OpenStack Storage (Swift) protocol using the "UFO" (Unified File and Object) translator.

Most of the functionality of GlusterFS is implemented as translators, including file-based mirroring and replication, file-based striping, file-based load balancing, volume failover, scheduling and disk caching, storage quotas, and volume snapshots with user serviceability (since GlusterFS version 3.6).

The GlusterFS server is intentionally kept simple: it exports an existing directory as-is, leaving it up to client-side translators to structure the store. The clients themselves are stateless, do not communicate with each other, and are expected to have translator configurations consistent with each other. GlusterFS relies on an elastic hashing algorithm, rather than using either a centralized or distributed metadata model. The user can add, delete, or migrate volumes dynamically, which helps to avoid configuration coherency problems. This allows GlusterFS to scale up to several petabytes on commodity hardware by avoiding bottlenecks that normally affect more tightly coupled distributed file systems.

GlusterFS provides data reliability and availability through two kinds of replication: replicated volumes and geo-replication. Replicated volumes keep more than one copy of each file across the bricks, so if one brick fails, the data is still stored and accessible. Geo-replication provides a master-slave model of replication, where volumes are copied across geographically distinct locations. This happens asynchronously and is useful for availability in case of a whole data center failure.

GlusterFS has been used as the foundation for academic research and a survey article. Red Hat markets the software for three markets: "on-premises", public cloud and "private cloud".
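
The idea of locating files by hashing instead of consulting a metadata server can be sketched in a few lines. This is an illustrative sketch only and not GlusterFS's actual algorithm: the functions `build_layout` and `locate`, the brick names, the use of CRC-32 as the hash, and the equal-sized ranges are all assumptions made for the example. It shows how stateless clients that share the same layout can each compute the same location for a file name without communicating with each other.

```python
import bisect
import zlib

HASH_SPACE = 2**32  # CRC-32 values fall in the range [0, 2**32)


def build_layout(bricks):
    """Split the hash space into one contiguous range per brick.

    Returns (upper_bounds, bricks), where upper_bounds[i] is the exclusive
    upper bound of the range assigned to bricks[i].
    """
    if not bricks:
        raise ValueError("at least one brick is required")
    step = HASH_SPACE // len(bricks)
    bounds = [step * (i + 1) for i in range(len(bricks))]
    bounds[-1] = HASH_SPACE  # the last range absorbs any rounding remainder
    return bounds, list(bricks)


def locate(filename, layout):
    """Return the brick whose hash range contains the hash of `filename`."""
    bounds, bricks = layout
    h = zlib.crc32(filename.encode("utf-8"))  # stand-in hash for this sketch
    # bisect_right finds the first range whose exclusive upper bound is > h.
    return bricks[bisect.bisect_right(bounds, h)]


if __name__ == "__main__":
    # Hypothetical brick names, used only for illustration.
    layout = build_layout(["server1:/brick", "server2:/brick", "server3:/brick"])
    for name in ["report.pdf", "video.mp4", "notes.txt"]:
        print(name, "->", locate(name, layout))
```

Because placement is a pure function of the file name and the layout, no lookup table has to be kept consistent between clients. In this simplified version, adding a brick changes the ranges and would move many files; a real system needs a rebalancing strategy, which this sketch does not attempt to model.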


See also

* BeeGFS
* Ceph (software)
* Distributed file system
* Distributed parallel fault-tolerant file systems
* Gfarm file system
* IBM Spectrum Scale (GPFS)
* LizardFS
* Lustre
* MapR FS
* Moose File System
* OrangeFS
* Parallel Virtual File System
* Quantcast File System
* RozoFS
* XtreemFS
* ZFS


External links

* Official website: https://www.gluster.org/