Networked Filesystem
   HOME

TheInfoList



OR:

A clustered file system is a
file system In computing, file system or filesystem (often abbreviated to fs) is a method and data structure that the operating system uses to control how data is stored and retrieved. Without a file system, data placed in a storage medium would be one larg ...
which is shared by being simultaneously
mounted Mount is often used as part of the name of specific mountains, e.g. Mount Everest. Mount or Mounts may also refer to: Places * Mount, Cornwall, a village in Warleggan parish, England * Mount, Perranzabuloe, a hamlet in Perranzabuloe parish, C ...
on multiple servers. There are several approaches to clustering, most of which do not employ a clustered file system (only
direct attached storage Direct-attached storage (DAS) is data storage device, digital storage directly attached to the computer accessing it, as opposed to storage accessed over a computer network (i.e. network-attached storage). DAS consists of one or more storage unit ...
for each node). Clustered file systems can provide features like location-independent addressing and redundancy which improve reliability or reduce the complexity of the other parts of the cluster. Parallel file systems are a type of clustered file system that spread data across multiple storage nodes, usually for redundancy or performance.


Shared-disk file system

A shared-disk file system uses a
storage area network A storage area network (SAN) or storage network is a computer network which provides access to consolidated, block-level data storage. SANs are primarily used to access data storage devices, such as disk arrays and tape libraries from serve ...
(SAN) to allow multiple computers to gain direct disk access at the block level. Access control and translation from file-level operations that applications use to block-level operations used by the SAN must take place on the client node. The most common type of clustered file system, the shared-disk file system —by adding mechanisms for
concurrency control In information technology and computer science, especially in the fields of computer programming, operating systems, multiprocessors, and databases, concurrency control ensures that correct results for Concurrent computing, concurrent operations a ...
—provides a consistent and serializable view of the file system, avoiding corruption and unintended
data loss Data loss is an error condition in information systems in which information is destroyed by failures (like failed spindle motors or head crashes on hard drives) or neglect (like mishandling, careless handling or storage under unsuitable conditions) ...
even when multiple clients try to access the same files at the same time. Shared-disk file-systems commonly employ some sort of
fencing Fencing is a group of three related combat sports. The three disciplines in modern fencing are the foil, the épée, and the sabre (also ''saber''); winning points are made through the weapon's contact with an opponent. A fourth discipline, s ...
mechanism to prevent data corruption in case of node failures, because an unfenced device can cause data corruption if it loses communication with its sister nodes and tries to access the same information other nodes are accessing. The underlying storage area network may use any of a number of block-level protocols, including
SCSI Small Computer System Interface (SCSI, ) is a set of standards for physically connecting and transferring data between computers and peripheral devices. The SCSI standards define commands, protocols, electrical, optical and logical interface ...
,
iSCSI Internet Small Computer Systems Interface or iSCSI ( ) is an Internet Protocol-based storage networking standard for linking data storage facilities. iSCSI provides block-level access to storage devices by carrying SCSI commands over a TCP/IP ...
,
HyperSCSI HyperSCSI is an outdated computer network Protocol (computing), protocol for accessing storage by sending and receiving SCSI commands. It was developed by researchers at the Data Storage Institute in Singapore in 2000 to 2003. HyperSCSI is unlike iS ...
, ATA over Ethernet (AoE),
Fibre Channel Fibre Channel (FC) is a high-speed data transfer protocol providing in-order, lossless delivery of raw block data. Fibre Channel is primarily used to connect computer data storage to servers in storage area networks (SAN) in commercial data cen ...
,
network block device On Linux, network block device (NBD) is a network protocol that can be used to forward a block device (typically a hard disk or partition) from one machine to a second machine. As an example, a local machine can access a hard disk drive that is a ...
, and
InfiniBand InfiniBand (IB) is a computer networking communications standard used in high-performance computing that features very high throughput and very low latency. It is used for data interconnect both among and within computers. InfiniBand is also used ...
. There are different architectural approaches to a shared-disk filesystem. Some distribute file information across all the servers in a cluster (fully distributed).


Examples

*
Blue Whale Clustered file system Blue Whale Clustered file system (BWFS) is a shared disk file system (also called clustered file system, ''shared storage file systems'' or SAN file system) made by Tianjin Zhongke Blue Whale Information Technologies Company in China. Overview ...
(BWFS) *
Silicon Graphics Silicon Graphics, Inc. (stylized as SiliconGraphics before 1999, later rebranded SGI, historically known as Silicon Graphics Computer Systems or SGCS) was an American high-performance computing manufacturer, producing computer hardware and soft ...
(SGI) clustered file system (
CXFS The CXFS file system (Clustered XFS) is a proprietary software, proprietary shared disk file system designed by Silicon Graphics (SGI) specifically to be used in a storage area network (SAN) environment. A significant difference between CXFS and o ...
) *
Veritas Cluster File System The Veritas Cluster File System (or VxCFS) is a cache coherent POSIX compliant shared file system built based upon VERITAS File System. It is distributed with a built-in Cluster Volume Manager (VxCVM) and components of other VERITAS Storage Foundat ...
* Microsoft
Cluster Shared Volumes Cluster Shared Volumes (CSV) is a feature of Failover Clustering first introduced in Windows Server 2008 R2 for use with the Hyper-V role. A Cluster Shared Volume is a shared disk containing an NTFS or ReFS (ReFS: Windows Server 2012 R2 or newer) ...
(CSV) * DataPlow
Nasan The Nasan Clustered File System is a shared disk file system created by the company DataPlow. Nasan software enables high-speed access to shared files located on shared, storage area network (SAN)-attached storage devices by utilizing the high-perfo ...
File System *
IBM General Parallel File System GPFS (General Parallel File System, brand name IBM Spectrum Scale) is high-performance clustered file system software developed by IBM. It can be deployed in shared-disk or shared-nothing distributed parallel modes, or a combination of these. It ...
(GPFS) *
Oracle Cluster File System The Oracle Cluster File System (OCFS, in its second version OCFS2) is a shared disk file system developed by Oracle Corporation and released under the GNU General Public License. The first version of OCFS was developed with the main focus to acco ...
(OCFS) *
OpenVMS OpenVMS, often referred to as just VMS, is a multi-user, multiprocessing and virtual memory-based operating system. It is designed to support time-sharing, batch processing, transaction processing and workstation applications. Customers using Ope ...
Files-11 Files-11 is the file system used by Digital Equipment Corporation OpenVMS operating system, and also (in a simpler form) by the older RSX-11. It is a hierarchical file system, with support for access control lists, record-oriented I/O, remote ...
File System * PolyServe storage solutions *
Quantum In physics, a quantum (plural quanta) is the minimum amount of any physical entity (physical property) involved in an interaction. The fundamental notion that a physical property can be "quantized" is referred to as "the hypothesis of quantizati ...
StorNext StorNext File System (SNFS), colloquially referred to as StorNext is a shared disk file system made by Quantum Corporation. StorNext enables multiple Windows, Linux and Apple workstations to access shared block storage over a Fibre Channel networ ...
File System (SNFS), ex ADIC, ex CentraVision File System (CVFS) * Red Hat
Global File System In computing, the Global File System 2 or GFS2 is a shared-disk file system for Linux computer clusters. GFS2 allows all members of a cluster to have direct concurrent access to the same shared block storage, in contrast to distributed file sys ...
(GFS2) * Sun
QFS QFS (Quick File System) is a filesystem from Oracle. It is tightly integrated with SAM, the Storage and Archive Manager, and hence is often referred to as SAM-QFS. SAM provides the functionality of a hierarchical storage manager. Features QFS ...
* TerraScale Technologies TerraFS * Veritas CFS (Cluster FS: Clustered VxFS) * Versity VSM (SAM-QFS ported to Linux), ScoutFS *
VMware VMFS VMware VMFS (Virtual Machine File System) is VMware, Inc.'s clustered file system used by the company's flagship server virtualization suite, vSphere. It was developed to store virtual machine disk images, including snapshots. Multiple servers c ...
* WekaFS * Apple
Xsan Xsan () is Apple Inc.'s storage area network (SAN) or clustered file system for macOS. Xsan enables multiple Mac desktop and Xserve systems to access shared block storage over a Fibre Channel network. With the Xsan file system installed, these ...
*
DragonFly BSD DragonFly BSD is a free and open-source Unix-like operating system forked from FreeBSD 4.8. Matthew Dillon, an Amiga developer in the late 1980s and early 1990s and FreeBSD developer between 1994 and 2003, began working on DragonFly BSD in Ju ...
HAMMER2 HAMMER2 is a successor to the HAMMER filesystem, redesigned from the ground up to support enhanced clustering. HAMMER2 supports online and batched deduplication, snapshots, directory entry indexing, multiple mountable filesystem roots, mounta ...


Distributed file systems

''Distributed file systems'' do not share block level access to the same storage but use a network
protocol Protocol may refer to: Sociology and politics * Protocol (politics), a formal agreement between nation states * Protocol (diplomacy), the etiquette of diplomacy and affairs of state * Etiquette, a code of personal behavior Science and technology ...
. These are commonly known as
network file system Network File System (NFS) is a distributed file system protocol originally developed by Sun Microsystems (Sun) in 1984, allowing a user on a client computer to access files over a computer network much like local storage is accessed. NFS, like ...
s, even though they are not the only file systems that use the network to send data. Distributed file systems can restrict access to the file system depending on access lists or capabilities on both the servers and the clients, depending on how the protocol is designed. The difference between a distributed file system and a
distributed data store A distributed data store is a computer network where information is stored on more than one node, often in a replicated fashion. It is usually specifically used to refer to either a distributed database where users store information on a ''numb ...
is that a distributed file system allows files to be accessed using the same interfaces and semantics as local files for example, mounting/unmounting, listing directories, read/write at byte boundaries, system's native permission model. Distributed data stores, by contrast, require using a different API or library and have different semantics (most often those of a database).


Design goals

Distributed file systems may aim for "transparency" in a number of aspects. That is, they aim to be "invisible" to client programs, which "see" a system which is similar to a local file system. Behind the scenes, the distributed file system handles locating files, transporting data, and potentially providing other features listed below. * ''Access transparency'': clients are unaware that files are distributed and can access them in the same way as local files are accessed. * ''Location transparency'': a consistent namespace exists encompassing local as well as remote files. The name of a file does not give its location. * ''Concurrency transparency'': all clients have the same view of the state of the file system. This means that if one process is modifying a file, any other processes on the same system or remote systems that are accessing the files will see the modifications in a coherent manner. * ''Failure transparency'': the client and client programs should operate correctly after a server failure. * ''Heterogeneity'': file service should be provided across different hardware and operating system platforms. * ''Scalability'': the file system should work well in small environments (1 machine, a dozen machines) and also scale gracefully to bigger ones (hundreds through tens of thousands of systems). * ''Replication transparency'': Clients should not have to be aware of the file replication performed across multiple servers to support scalability. * ''Migration transparency'': files should be able to move between different servers without the client's knowledge.


History

The
Incompatible Timesharing System Incompatible Timesharing System (ITS) is a time-sharing operating system developed principally by the MIT Artificial Intelligence Laboratory, with help from Project MAC. The name is the jocular complement of the MIT Compatible Time-Sharing System ...
used virtual devices for transparent inter-machine file system access in the 1960s. More file servers were developed in the 1970s. In 1976
Digital Equipment Corporation Digital Equipment Corporation (DEC ), using the trademark Digital, was a major American company in the computer industry from the 1960s to the 1990s. The company was co-founded by Ken Olsen and Harlan Anderson in 1957. Olsen was president unt ...
created the
File Access Listener DECnet is a suite of network protocols created by Digital Equipment Corporation. Originally released in 1975 in order to connect two PDP-11 minicomputers, it evolved into one of the first peer-to-peer network architectures, thus transforming DEC ...
(FAL), an implementation of the
Data Access Protocol In the pursuit of knowledge, data (; ) is a collection of discrete values that convey information, describing quantity, quality, fact, statistics, other basic units of meaning, or simply sequences of symbols that may be further interpreted. ...
as part of
DECnet DECnet is a suite of network protocols created by Digital Equipment Corporation. Originally released in 1975 in order to connect two PDP-11 minicomputers, it evolved into one of the first peer-to-peer network architectures, thus transforming DEC ...
Phase II which became the first widely used network file system. In 1985
Sun Microsystems Sun Microsystems, Inc. (Sun for short) was an American technology company that sold computers, computer components, software, and information technology services and created the Java programming language, the Solaris operating system, ZFS, the ...
created the file system called "
Network File System Network File System (NFS) is a distributed file system protocol originally developed by Sun Microsystems (Sun) in 1984, allowing a user on a client computer to access files over a computer network much like local storage is accessed. NFS, like ...
" (NFS) which became the first widely used
Internet Protocol The Internet Protocol (IP) is the network layer communications protocol in the Internet protocol suite for relaying datagrams across network boundaries. Its routing function enables internetworking, and essentially establishes the Internet. IP h ...
based network file system. Other notable network file systems are
Andrew File System The Andrew File System (AFS) is a distributed file system which uses a set of trusted servers to present a homogeneous, location-transparent file name space to all the client workstations. It was developed by Carnegie Mellon University as part of t ...
(AFS),
Apple Filing Protocol The Apple Filing Protocol (AFP), formerly AppleTalk Filing Protocol, is a proprietary network protocol, and part of the Apple File Service (AFS), that offers file services for macOS and the classic Mac OS. In Mac OS 9 and earlier, AFP was the ...
(AFP),
NetWare Core Protocol The NetWare Core Protocol (NCP) is a network protocol used in some products from Novell, Inc. It is usually associated with the client-server operating system Novell NetWare which originally supported primarily MS-DOS client stations, but later su ...
(NCP), and
Server Message Block Server Message Block (SMB) is a communication protocol originally developed in 1983 by Barry A. Feigenbaum at IBM and intended to provide shared access to files and printers across nodes on a network of systems running IBM's OS/2. It also provides ...
(SMB) which is also known as Common Internet File System (CIFS). In 1986, IBM announced client and server support for Distributed Data Management Architecture (DDM) for the
System/36 The IBM System/36 (often abbreviated as S/36) was a midrange computer marketed by IBM from 1983 to 2000 - a multi-user, multi-tasking successor to the System/34. Like the System/34 and the older System/32, the System/36 was primarily progr ...
,
System/38 The System/38 is a discontinued minicomputer and midrange computer manufactured and sold by IBM. The system was announced in 1978. The System/38 has 48-bit addressing, which was unique for the time, and a novel integrated database system. It w ...
, and IBM mainframe computers running
CICS IBM CICS (Customer Information Control System) is a family of mixed-language application servers that provide online transaction management and connectivity for applications on IBM mainframe systems under z/OS and z/VSE. CICS family products ...
. This was followed by the support for
IBM Personal Computer The IBM Personal Computer (model 5150, commonly known as the IBM PC) is the first microcomputer released in the IBM PC model line and the basis for the IBM PC compatible de facto standard. Released on August 12, 1981, it was created by a team ...
,
AS/400 The IBM AS/400 (Application System/400) is a family of midrange computers from IBM announced in June 1988 and released in August 1988. It was the successor to the System/36 and System/38 platforms, and ran the OS/400 operating system. Lower-cost ...
, IBM mainframe computers under the
MVS Multiple Virtual Storage, more commonly called MVS, was the most commonly used operating system on the System/370 and System/390 IBM mainframe computers. IBM developed MVS, along with OS/VS1 and SVS, as a successor to OS/360. It is unrelated ...
and VSE operating systems, and
FlexOS FlexOS is a discontinued modular real-time multiuser computer multitasking, multitasking operating system (RTOS) designed for computer-integrated manufacturing, laboratory, retail and financial markets. Developed by Digital Research's Flexible ...
. DDM also became the foundation for Distributed Relational Database Architecture, also known as DRDA. There are many
peer-to-peer Peer-to-peer (P2P) computing or networking is a distributed application architecture that partitions tasks or workloads between peers. Peers are equally privileged, equipotent participants in the network. They are said to form a peer-to-peer n ...
network protocol A communication protocol is a system of rules that allows two or more entities of a communications system to transmit information via any kind of variation of a physical quantity. The protocol defines the rules, syntax, semantics and synchroniza ...
s for open-source distributed file systems for cloud or closed-source clustered file systems, e. g.: 9P, AFS,
Coda Coda or CODA may refer to: Arts, entertainment, and media Films * Movie coda, a post-credits scene * ''Coda'' (1987 film), an Australian horror film about a serial killer, made for television *''Coda'', a 2017 American experimental film from Na ...
, CIFS/SMB, DCE/DFS,
WekaFS
Lustre Lustre or Luster may refer to: Places * Luster, Norway, a municipality in Vestlandet, Norway ** Luster (village), a village in the municipality of Luster * Lustre, Montana, an unincorporated community in the United States Entertainment * '' ...

PanFS
Google File System Google File System (GFS or GoogleFS, not to be confused with the GFS Linux file system) is a proprietary distributed file system developed by Google to provide efficient, reliable access to data using large clusters of commodity hardware. Goo ...
,
Mnet M-Net (an abbreviation of Electronic Media Network) is a South African pay television channel established by Naspers in 1986. The channel broadcasts both local and international programming, including general entertainment, children's series, ...
,
Chord Project In computing, Chord is a protocol and algorithm for a peer-to-peer distributed hash table. A distributed hash table stores associative array, key-value pairs by assigning keys to different computers (known as "nodes"); a node will store the values ...
.


Examples

*
Alluxio Alluxio is an open-source virtual distributed file system (VDFS). Initially as research project "Tachyon", Alluxio was created at the University of California, Berkeley's AMPLab as Haoyuan Li's Ph.D. Thesis, advised by Professor Scott Shenker & ...
*
BeeGFS BeeGFS (formerly FhGFS) is a parallel file system, developed and optimized for high-performance computing. BeeGFS includes a distributed metadata architecture for scalability and flexibility reasons. Its most used and widely known aspect is data ...
(Fraunhofer) *
CephFS Ceph (pronounced ) is an open-source software-defined storage platform that implements object storage on a single distributed computer cluster and provides 3-in-1 interfaces for object-, block- and file-level storage. Ceph aims primarily f ...
(Inktank, Red Hat, SUSE) * Windows Distributed File System (DFS) (Microsoft) * Infinit (acquired by Docker) * GfarmFS *
GlusterFS Gluster Inc. (formerly known as Z RESEARCH) was a software company that provided an open source platform for scale-out public and private cloud storage. The company was privately funded and headquartered in Sunnyvale, California, with an engineer ...
(Red Hat) * GFS (Google Inc.) * GPFS (IBM) *
HDFS Apache Hadoop () is a collection of open-source software utilities that facilitates using a network of many computers to solve problems involving massive amounts of data and computation. It provides a software framework for distributed storage a ...
(Apache Software Foundation) *
IPFS The InterPlanetary File System (IPFS) is a protocol, hypermedia and file sharing peer-to-peer network for storing and sharing data in a distributed file system. IPFS uses content-addressing to uniquely identify each file in a global namespac ...
(Inter Planetary File System) * iRODS
JuiceFS
(Juicedata) *
LizardFS LizardFS is an open source distributed file system that is POSIX-compliant and licensed under GPLv3. It was released in 2013 as fork of MooseFS. LizardFS is also offering a paid Technical Support (Standard, Enterprise and Enterprise Plus) with p ...
(Skytechnology) *
Lustre Lustre or Luster may refer to: Places * Luster, Norway, a municipality in Vestlandet, Norway ** Luster (village), a village in the municipality of Luster * Lustre, Montana, an unincorporated community in the United States Entertainment * '' ...
*
MapR FS The MapR File System (MapR FS) is a clustered file system that supports both very large-scale and high-performance uses. MapR FS supports a variety of interfaces including conventional read/write file access via NFS and a FUSE interface, as well ...
*
MooseFS Moose File System (MooseFS) is an open-source, POSIX-compliant distributed file system developed by Core Technology. MooseFS aims to be fault-tolerant, highly available, highly performing, scalable general-purpose network distributed file system ...
(Core Technology / Gemius) *
ObjectiveFS ObjectiveFS is a distributed file system developed by Objective Security Corp. It is a POSIX-compliant file system built with an object store backend.Christophe Bard"LeMagIT: ObjectiveFS: a POSIX file system on top of S3 (original article in Frenc ...
*
OneFS The OneFS File System is a parallel distributed networked file system designed by Isilon Systems and is the basis for the ''Isilon Scale-out Storage Platform''. The OneFS file system is controlled and managed by the OneFS Operating System, a Fre ...
(EMC Isilon) *
OrangeFS OrangeFS is an open-source parallel file system, the next generation of Parallel Virtual File System (PVFS). A parallel file system is a type of distributed file system that distributes file data across multiple servers and provides for concurr ...
(Clemson University, Omnibond Systems), formerly
Parallel Virtual File System The Parallel Virtual File System (PVFS) is an open-source parallel file system. A parallel file system is a type of distributed file system that distributes file data across multiple servers and provides for concurrent access by multiple tasks of ...
* PanFS (Panasas) *
Parallel Virtual File System The Parallel Virtual File System (PVFS) is an open-source parallel file system. A parallel file system is a type of distributed file system that distributes file data across multiple servers and provides for concurrent access by multiple tasks of ...
(Clemson University, Argonne National Laboratory, Ohio Supercomputer Center) *
RozoFS RozoFS is a free software distributed file system. It comes as a free software, licensed under the GNU GPL v2. RozoFS uses erasure coding for redundancy. Design Rozo provides an open source POSIX filesystem, built on top of distributed file s ...
(Rozo Systems) *
SMB/CIFS Server Message Block (SMB) is a communication protocol originally developed in 1983 by Barry A. Feigenbaum at IBM and intended to provide shared access to files and printers across nodes on a network of systems running IBM's OS/2. It also provides ...
* Torus (CoreOS) * WekaFS (WekaIO) *
XtreemFS XtreemFS is an object-based, distributed file system for wide area networks.F. Hupfeld, T. Cortes, B. Kolbeck, E. Focht, M. Hess, J. Malo, J. Marti, J. Stender, E. Cesario"XtreemFS - a case for object-based storage in Grid data management" VLDB W ...


Network-attached storage

Network-attached storage (NAS) provides both storage and a file system, like a shared disk file system on top of a storage area network (SAN). NAS typically uses file-based protocols (as opposed to block-based protocols a SAN would use) such as NFS (popular on
UNIX Unix (; trademarked as UNIX) is a family of multitasking, multiuser computer operating systems that derive from the original AT&T Unix, whose development started in 1969 at the Bell Labs research center by Ken Thompson, Dennis Ritchie, and ot ...
systems), SMB/CIFS ( Server Message Block/Common Internet File System) (used with MS Windows systems), AFP (used with
Apple Macintosh The Mac (known as Macintosh until 1999) is a family of personal computers designed and marketed by Apple Inc. Macs are known for their ease of use and minimalist designs, and are popular among students, creative professionals, and software en ...
computers), or NCP (used with
OES Oes or owes were metallic "O" shaped rings or eyelets sewn on to clothes and furnishing textiles for decorative effect in England and at the Elizabethan and Jacobean court. They were smaller than modern sequins. Making and metals Robert Sharp obta ...
and
Novell NetWare NetWare is a discontinued computer network operating system developed by Novell, Inc. It initially used cooperative multitasking to run various services on a personal computer, using the IPX network protocol. The original NetWare product in 19 ...
).


Design considerations


Avoiding single point of failure

The failure of disk hardware or a given storage node in a cluster can create a
single point of failure A single point of failure (SPOF) is a part of a system that, if it fails, will stop the entire system from working. SPOFs are undesirable in any system with a goal of high availability or reliability, be it a business practice, software appl ...
that can result in
data loss Data loss is an error condition in information systems in which information is destroyed by failures (like failed spindle motors or head crashes on hard drives) or neglect (like mishandling, careless handling or storage under unsuitable conditions) ...
or unavailability.
Fault tolerance Fault tolerance is the property that enables a system to continue operating properly in the event of the failure of one or more faults within some of its components. If its operating quality decreases at all, the decrease is proportional to the ...
and high availability can be provided through
data replication Replication in computing involves sharing information so as to ensure consistency between redundant resources, such as software or hardware components, to improve reliability, fault-tolerance, or accessibility. Terminology Replication in comp ...
of one sort or another, so that data remains intact and available despite the failure of any single piece of equipment. For examples, see the lists of distributed fault-tolerant file systems and distributed parallel fault-tolerant file systems.


Performance

A common
performance A performance is an act of staging or presenting a play, concert, or other form of entertainment. It is also defined as the action or process of carrying out or accomplishing an action, task, or function. Management science In the work place ...
measurement Measurement is the quantification of attributes of an object or event, which can be used to compare with other objects or events. In other words, measurement is a process of determining how large or small a physical quantity is as compared ...
of a clustered file system is the amount of time needed to satisfy service requests. In conventional systems, this time consists of a disk-access time and a small amount of
CPU A central processing unit (CPU), also called a central processor, main processor or just processor, is the electronic circuitry that executes instructions comprising a computer program. The CPU performs basic arithmetic, logic, controlling, and ...
-processing time. But in a clustered file system, a remote access has additional overhead due to the distributed structure. This includes the time to deliver the request to a server, the time to deliver the response to the client, and for each direction, a CPU overhead of running the
communication protocol A communication protocol is a system of rules that allows two or more entities of a communications system to transmit information via any kind of variation of a physical quantity. The protocol defines the rules, syntax, semantics (computer scien ...
software Software is a set of computer programs and associated documentation and data. This is in contrast to hardware, from which the system is built and which actually performs the work. At the lowest programming level, executable code consists ...
.


Concurrency

Concurrency control becomes an issue when more than one person or client is accessing the same file or block and want to update it. Hence updates to the file from one client should not interfere with access and updates from other clients. This problem is more complex with file systems due to concurrent overlapping writes, where different writers write to overlapping regions of the file concurrently.Pessach, Yaniv (2013). ''Distributed Storage: Concepts, Algorithms, and Implementations''. . This problem is usually handled by
concurrency control In information technology and computer science, especially in the fields of computer programming, operating systems, multiprocessors, and databases, concurrency control ensures that correct results for Concurrent computing, concurrent operations a ...
or locking which may either be built into the file system or provided by an add-on protocol.


History

IBM mainframes in the 1970s could share physical disks and file systems if each machine had its own channel connection to the drives' control units. In the 1980s,
Digital Equipment Corporation Digital Equipment Corporation (DEC ), using the trademark Digital, was a major American company in the computer industry from the 1960s to the 1990s. The company was co-founded by Ken Olsen and Harlan Anderson in 1957. Olsen was president unt ...
's
TOPS-20 The TOPS-20 operating system by Digital Equipment Corporation (DEC) is a proprietary OS used on some of DEC's 36-bit mainframe computers. The Hardware Reference Manual was described as for "DECsystem-10/DECSYSTEM-20 Processor" (meaning the DEC PDP- ...
and
OpenVMS OpenVMS, often referred to as just VMS, is a multi-user, multiprocessing and virtual memory-based operating system. It is designed to support time-sharing, batch processing, transaction processing and workstation applications. Customers using Ope ...
clusters (VAX/ALPHA/IA64) included shared disk file systems.


See also

*
Distributed file system A clustered file system is a file system which is shared by being simultaneously mounted on multiple servers. There are several approaches to clustering, most of which do not employ a clustered file system (only direct attached storage for ...
*
Network-attached storage Network-attached storage (NAS) is a file-level (as opposed to block-level storage) computer data storage server connected to a computer network providing data access to a heterogeneous group of clients. The term "NAS" can refer to both the tech ...
*
Storage area network A storage area network (SAN) or storage network is a computer network which provides access to consolidated, block-level data storage. SANs are primarily used to access data storage devices, such as disk arrays and tape libraries from serve ...
*
Shared resource In computing, a shared resource, or network share, is a System resource, computer resource made available from one Host (network), host to other hosts on a computer network. It is a device or piece of information on a computer that can be remot ...
*
Direct-attached storage Direct-attached storage (DAS) is digital storage directly attached to the computer accessing it, as opposed to storage accessed over a computer network (i.e. network-attached storage). DAS consists of one or more storage units such as hard drives ...
*
Peer-to-peer file sharing Peer-to-peer file sharing is the distribution and sharing of digital media using peer-to-peer (P2P) networking technology. P2P file sharing allows users to access media files such as books, music, movies, and games using a P2P software program tha ...
*
Disk sharing Disk or disc may refer to: * Disk (mathematics), a geometric shape * Disk storage Music * Disc (band), an American experimental music band * ''Disk'' (album), a 1995 EP by Moby Other uses * Disk (functional analysis), a subset of a vector space ...
*
Distributed data store A distributed data store is a computer network where information is stored on more than one node, often in a replicated fashion. It is usually specifically used to refer to either a distributed database where users store information on a ''numb ...
*
Distributed file system for cloud A distributed file system for cloud is a file system that allows many clients to have access to data and supports operations (create, delete, modify, read, write) on that data. Each data file may be partitioned into several parts called chunks. E ...
*
Global file system In computing, the Global File System 2 or GFS2 is a shared-disk file system for Linux computer clusters. GFS2 allows all members of a cluster to have direct concurrent access to the same shared block storage, in contrast to distributed file sys ...
*
Gopher (protocol) The Gopher protocol () is a communication protocol designed for distributing, searching, and retrieving documents in Internet Protocol networks. The design of the Gopher protocol and user interface is menu-driven, and presented an alternative to ...
* List of distributed file systems *
CacheFS CacheFS is the name used for several similar software technologies designed to speed up distributed file system file access for networked computers. These technologies operate by storing ( cached) copies of files on secondary memory, typically a loc ...
*
RAID Raid, RAID or Raids may refer to: Attack * Raid (military), a sudden attack behind the enemy's lines without the intention of holding ground * Corporate raid, a type of hostile takeover in business * Panty raid, a prankish raid by male college ...


References


Further reading


A Taxonomy of Distributed Storage Systems

A Taxonomy and Survey on Distributed File Systems

A survey of distributed file systems

The Evolution of File Systems
{{File systems, state=collapsed Computer file systems Data management Distributed data storage Network file systems Storage area networks