A clustered file system is a file system which is shared by being simultaneously mounted on multiple servers. There are several approaches to clustering, most of which do not employ a clustered file system (only direct-attached storage for each node). Clustered file systems can provide features like location-independent addressing and redundancy which improve reliability or reduce the complexity of the other parts of the cluster. Parallel file systems are a type of clustered file system that spread data across multiple storage nodes, usually for redundancy or performance.
Shared-disk file system
A shared-disk file system uses a storage area network (SAN) to allow multiple computers to gain direct disk access at the block level. Access control and translation from the file-level operations that applications use to the block-level operations used by the SAN must take place on the client node. The shared-disk file system, the most common type of clustered file system, adds mechanisms for concurrency control that provide a consistent and serializable view of the file system, avoiding corruption and unintended data loss even when multiple clients try to access the same files at the same time. Shared-disk file systems commonly employ some sort of fencing mechanism to prevent data corruption in case of node failures, because an unfenced device can cause data corruption if it loses communication with its sister nodes and tries to access the same information other nodes are accessing.
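To make the role of fencing concrete, the following is a minimal illustrative sketch, not the mechanism of any particular file system, of how a node might refuse to issue writes to shared storage once it can no longer prove cluster membership. The ClusterMembership class, its lease logic, and write_block are hypothetical names invented for this example.

```python
import time

class ClusterMembership:
    """Hypothetical view of cluster membership backed by a lock/quorum service."""
    def __init__(self, lease_seconds: float = 5.0):
        self.lease_seconds = lease_seconds
        self.lease_expiry = 0.0

    def renew_lease(self) -> None:
        # A real implementation would contact a quorum of peers or a lock manager.
        self.lease_expiry = time.monotonic() + self.lease_seconds

    def lease_valid(self) -> bool:
        return time.monotonic() < self.lease_expiry


def write_block(device, block_number: int, data: bytes,
                membership: ClusterMembership, block_size: int = 4096) -> None:
    """Write one block to shared storage, but only while the lease is valid.

    A node whose lease has expired must assume it has been fenced off and stop
    writing, otherwise it could corrupt data the surviving nodes are updating.
    """
    if not membership.lease_valid():
        raise RuntimeError("lease expired: node is fenced off from shared storage")
    device.seek(block_number * block_size)
    device.write(data)
```

Real deployments typically combine such software checks with hardware-level fencing (for example, SCSI reservations or power fencing) so that a misbehaving node is cut off even if its software does not cooperate.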
The underlying storage area network may use any of a number of block-level protocols, including SCSI, iSCSI, HyperSCSI, ATA over Ethernet (AoE), Fibre Channel, network block device, and InfiniBand.
There are different architectural approaches to a shared-disk file system. Some distribute file information across all the servers in a cluster (fully distributed).
Examples
* Blue Whale Clustered file system (BWFS)
* Silicon Graphics (SGI) clustered file system (CXFS)
* Veritas Cluster File System
* Microsoft Cluster Shared Volumes (CSV)
* DataPlow Nasan File System
* IBM General Parallel File System (GPFS)
* Oracle Cluster File System (OCFS)
* OpenVMS Files-11 File System
* PolyServe storage solutions
* Quantum StorNext File System (SNFS), ex ADIC, ex CentraVision File System (CVFS)
* Red Hat Global File System (GFS2)
* Sun QFS
* TerraScale Technologies TerraFS
* Veritas CFS (Cluster FS: Clustered VxFS)
* Versity VSM (SAM-QFS ported to Linux), ScoutFS
* VMware VMFS
* WekaFS
* Apple Xsan
* DragonFly BSD HAMMER2
Distributed file systems
''Distributed file systems'' do not share block-level access to the same storage but use a network protocol. These are commonly known as network file systems, even though they are not the only file systems that use the network to send data.
Distributed file systems can restrict access to the file system depending on access lists or capabilities on both the servers and the clients, depending on how the protocol is designed.
The difference between a distributed file system and a distributed data store is that a distributed file system allows files to be accessed using the same interfaces and semantics as local files, for example mounting/unmounting, listing directories, read/write at byte boundaries, and the system's native permission model. Distributed data stores, by contrast, require using a different API or library and have different semantics (most often those of a database).
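As an illustration of that interface difference, here is a short Python sketch. The mount point is simulated with a temporary local directory (in practice it would be, for example, an NFS mount), and the key-value client at the end is purely hypothetical.

```python
import os
import tempfile

MOUNT_POINT = tempfile.mkdtemp()          # stand-in for a mounted network file system
path = os.path.join(MOUNT_POINT, "report.txt")

# A distributed file system is used through the ordinary local-file
# interface: paths, directories, byte-range reads and writes.
with open(path, "wb") as f:
    f.write(b"quarterly figures...")
with open(path, "rb") as f:
    f.seek(10)                            # byte-addressable, like a local file
    tail = f.read()
print(os.listdir(MOUNT_POINT))            # normal directory listing

# A distributed data store, by contrast, needs its own client API and
# semantics (purely hypothetical client shown, commented out):
# client = SomeKeyValueStore("datastore.example.com:7000")
# client.put("report", b"quarterly figures...")   # whole-object put/get, no seek()
# value = client.get("report")
```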
Design goals
Distributed file systems may aim for "transparency" in a number of aspects. That is, they aim to be "invisible" to client programs, which "see" a system that is similar to a local file system. Behind the scenes, the distributed file system handles locating files, transporting data, and potentially providing other features listed below.
* ''Access transparency'': clients are unaware that files are distributed and can access them in the same way as local files are accessed.
* ''Location transparency'': a consistent namespace exists encompassing local as well as remote files. The name of a file does not give its location.
* ''Concurrency transparency'': all clients have the same view of the state of the file system. This means that if one process is modifying a file, any other processes on the same system or remote systems that are accessing the files will see the modifications in a coherent manner.
* ''Failure transparency'': the client and client programs should operate correctly after a server failure.
* ''Heterogeneity'': file service should be provided across different hardware and operating system platforms.
* ''Scalability'': the file system should work well in small environments (1 machine, a dozen machines) and also scale gracefully to bigger ones (hundreds through tens of thousands of systems).
* ''Replication transparency'': Clients should not have to be aware of the file replication performed across multiple servers to support scalability.
* ''Migration transparency'': files should be able to move between different servers without the client's knowledge.
History
The Incompatible Timesharing System used virtual devices for transparent inter-machine file system access in the 1960s. More file servers were developed in the 1970s. In 1976 Digital Equipment Corporation created the File Access Listener (FAL), an implementation of the Data Access Protocol as part of DECnet Phase II, which became the first widely used network file system. In 1985 Sun Microsystems created the file system called "Network File System" (NFS) which became the first widely used Internet Protocol based network file system.
Other notable network file systems are Andrew File System (AFS), Apple Filing Protocol (AFP), NetWare Core Protocol (NCP), and Server Message Block (SMB), which is also known as Common Internet File System (CIFS).
In 1986, IBM announced client and server support for Distributed Data Management Architecture (DDM) for the System/36, System/38, and IBM mainframe computers running CICS. This was followed by support for the IBM Personal Computer, AS/400, IBM mainframe computers under the MVS and VSE operating systems, and FlexOS. DDM also became the foundation for Distributed Relational Database Architecture, also known as DRDA.
There are many peer-to-peer network protocols for open-source distributed file systems for cloud or closed-source clustered file systems, e.g.: 9P, AFS, Coda, CIFS/SMB, DCE/DFS, WekaFS, Lustre, PanFS, Google File System, Mnet, Chord Project.
Examples
* Alluxio
* BeeGFS (Fraunhofer)
* CephFS (Inktank, Red Hat, SUSE)
* Windows Distributed File System (DFS) (Microsoft)
* Infinit (acquired by Docker)
* GfarmFS
* GlusterFS (Red Hat)
* GFS (Google Inc.)
* GPFS (IBM)
* HDFS (Apache Software Foundation)
* IPFS (InterPlanetary File System)
* iRODS
* JuiceFS (Juicedata)
* LizardFS (Skytechnology)
* Lustre
* MapR FS
* MooseFS (Core Technology / Gemius)
* ObjectiveFS
* OneFS (EMC Isilon)
* OrangeFS (Clemson University, Omnibond Systems), formerly Parallel Virtual File System
* PanFS (Panasas)
* Parallel Virtual File System (Clemson University, Argonne National Laboratory, Ohio Supercomputer Center)
* RozoFS (Rozo Systems)
* SMB/CIFS
* Torus (CoreOS)
* WekaFS (WekaIO)
* XtreemFS
Network-attached storage
Network-attached storage (NAS) provides both storage and a file system, like a shared-disk file system on top of a storage area network (SAN). NAS typically uses file-based protocols (as opposed to the block-based protocols a SAN would use) such as NFS (popular on UNIX systems), SMB/CIFS (Server Message Block/Common Internet File System) (used with MS Windows systems), AFP (used with Apple Macintosh computers), or NCP (used with OES and Novell NetWare).
Design considerations
Avoiding single point of failure
The failure of disk hardware or a given storage node in a cluster can create a single point of failure that can result in data loss or unavailability. Fault tolerance and high availability can be provided through data replication of one sort or another, so that data remains intact and available despite the failure of any single piece of equipment. For examples, see the lists of distributed fault-tolerant file systems and distributed parallel fault-tolerant file systems.
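As a rough illustration of replication-based fault tolerance (a simplified sketch, not how any particular system implements it), a write can be acknowledged once a majority of replicas have stored it, so the data survives the loss of any single node. The Replica class and replicated_write function are hypothetical.

```python
class Replica:
    """Toy stand-in for a storage node holding copies of data blocks."""
    def __init__(self, name: str):
        self.name = name
        self.blocks: dict[int, bytes] = {}

    def store(self, block: int, data: bytes) -> bool:
        self.blocks[block] = data
        return True                       # pretend the write always succeeds


def replicated_write(replicas: list[Replica], block: int, data: bytes) -> bool:
    """Return True once a majority of replicas have stored the block."""
    acks = sum(1 for r in replicas if r.store(block, data))
    return acks > len(replicas) // 2


nodes = [Replica("a"), Replica("b"), Replica("c")]
assert replicated_write(nodes, block=7, data=b"payload")
```

Real systems add failure detection, re-replication of lost copies, and consistency protocols on top of this basic idea.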
Performance
A common performance measurement of a clustered file system is the amount of time needed to satisfy service requests. In conventional systems, this time consists of a disk-access time and a small amount of CPU-processing time. But in a clustered file system, a remote access has additional overhead due to the distributed structure. This includes the time to deliver the request to a server, the time to deliver the response to the client, and for each direction, a CPU overhead of running the communication protocol software.
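The following toy model (with made-up numbers, purely for illustration) shows how those components add up for a single remote request compared with a purely local access:

```python
def remote_request_time(disk_ms: float, local_cpu_ms: float,
                        net_one_way_ms: float, protocol_cpu_ms: float) -> float:
    """Local disk + CPU cost, plus request/response network trips, plus
    per-direction CPU overhead for running the communication protocol."""
    return (disk_ms + local_cpu_ms
            + 2 * net_one_way_ms          # request to server, response back
            + 2 * protocol_cpu_ms)        # protocol stack runs in each direction

# Example with invented numbers (milliseconds):
print(remote_request_time(disk_ms=5.0, local_cpu_ms=0.2,
                          net_one_way_ms=0.5, protocol_cpu_ms=0.1))
```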
Concurrency
Concurrency control becomes an issue when more than one person or client is accessing the same file or block and wants to update it. Hence updates to the file from one client should not interfere with access and updates from other clients. This problem is more complex with file systems due to concurrent overlapping writes, where different writers write to overlapping regions of the file concurrently. [Pessach, Yaniv (2013). ''Distributed Storage: Concepts, Algorithms, and Implementations''.] This problem is usually handled by concurrency control or locking, which may either be built into the file system or provided by an add-on protocol.
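For example, a POSIX byte-range lock is one locking primitive of this kind. The sketch below uses Python's fcntl module (Unix-only) to take an exclusive advisory lock on the first 4 KiB of a file before updating it; whether such a lock is honoured by other nodes in the cluster depends on the file system or lock protocol in use (NFSv4, for instance, integrates file locking into the protocol). The file name is an arbitrary example.

```python
import fcntl
import os

path = "shared.dat"                       # example file name
fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
with os.fdopen(fd, "r+b") as f:
    # Exclusive advisory lock on bytes 0-4095 (length 4096, offset 0, from start).
    fcntl.lockf(f, fcntl.LOCK_EX, 4096, 0, 0)
    try:
        f.seek(0)
        f.write(b"updated header block")
        f.flush()
    finally:
        fcntl.lockf(f, fcntl.LOCK_UN, 4096, 0, 0)
```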
History
IBM mainframes in the 1970s could share physical disks and file systems if each machine had its own channel connection to the drives' control units. In the 1980s, Digital Equipment Corporation's TOPS-20 and OpenVMS clusters (VAX/ALPHA/IA64) included shared disk file systems.
See also
* Distributed file system
* Network-attached storage
* Storage area network
* Shared resource
* Direct-attached storage
* Peer-to-peer file sharing
* Disk sharing
* Distributed data store
* Distributed file system for cloud
* Global file system
* Gopher (protocol)
* List of distributed file systems
* CacheFS
* RAID
References
Further reading
* A Taxonomy of Distributed Storage Systems
* A Taxonomy and Survey on Distributed File Systems
* A survey of distributed file systems
* The Evolution of File Systems