HOME

TheInfoList



OR:

In
computing Computing is any goal-oriented activity requiring, benefiting from, or creating computing machinery. It includes the study and experimentation of algorithmic processes, and development of both hardware and software. Computing has scientific, e ...
, the Global File System 2 or GFS2 is a
shared-disk file system A clustered file system is a file system which is shared by being simultaneously mounted on multiple servers. There are several approaches to clustering, most of which do not employ a clustered file system (only direct attached storage for ...
for
Linux Linux ( or ) is a family of open-source Unix-like operating systems based on the Linux kernel, an operating system kernel first released on September 17, 1991, by Linus Torvalds. Linux is typically packaged as a Linux distribution, which ...
computer clusters. GFS2 allows all members of a cluster to have direct concurrent access to the same shared
block storage In computing (specifically data transmission and data storage), a block, sometimes called a physical record, is a sequence of bytes or bits, usually containing some whole number of records, having a maximum length; a ''block size''. Data thu ...
, in contrast to
distributed file systems A clustered file system is a file system which is shared by being simultaneously mounted on multiple servers. There are several approaches to clustering, most of which do not employ a clustered file system (only direct attached storage for ...
which distribute data throughout the cluster. GFS2 can also be used as a local file system on a single computer. GFS2 has no disconnected operating-mode, and no client or server roles. All nodes in a GFS2 cluster function as peers. Using GFS2 in a cluster requires hardware to allow access to the shared storage, and a lock manager to control access to the storage. The lock manager operates as a separate module: thus GFS2 can use the
Distributed Lock Manager Operating systems use lock managers to organise and serialise the access to resources. A distributed lock manager (DLM) runs in every machine in a cluster, with an identical copy of a cluster-wide lock database. In this way a DLM provides software ...
(DLM) for
cluster may refer to: Science and technology Astronomy * Cluster (spacecraft), constellation of four European Space Agency spacecraft * Asteroid cluster, a small asteroid family * Cluster II (spacecraft), a European Space Agency mission to study t ...
configurations and the "nolock" lock manager for local filesystems. Older versions of GFS also support GULM, a server-based lock manager which implements redundancy via failover. GFS and GFS2 are
free software Free software or libre software is computer software distributed under terms that allow users to run the software for any purpose as well as to study, change, and distribute it and any adapted versions. Free software is a matter of liberty, no ...
, distributed under the terms of the
GNU General Public License The GNU General Public License (GNU GPL or simply GPL) is a series of widely used free software licenses that guarantee end users the Four Freedoms (Free software), four freedoms to run, study, share, and modify the software. The license was th ...
.


History

Development of GFS began in 1995 and was originally developed by
University of Minnesota The University of Minnesota, formally the University of Minnesota, Twin Cities, (UMN Twin Cities, the U of M, or Minnesota) is a public university, public Land-grant university, land-grant research university in the Minneapolis–Saint Paul, Tw ...
professor Matthew O'Keefe and a group of students. It was originally written for
SGI SGI may refer to: Companies *Saskatchewan Government Insurance *Scientific Games International, a gambling company *Silicon Graphics, Inc., a former manufacturer of high-performance computing products *Silicon Graphics International, formerly Rac ...
's
IRIX IRIX ( ) is a discontinued operating system developed by Silicon Graphics (SGI) to run on the company's proprietary MIPS workstations and servers. It is based on UNIX System V with BSD extensions. In IRIX, SGI originated the XFS file system and ...
operating system, but in 1998 it was ported to
Linux Linux ( or ) is a family of open-source Unix-like operating systems based on the Linux kernel, an operating system kernel first released on September 17, 1991, by Linus Torvalds. Linux is typically packaged as a Linux distribution, which ...
since the
open source Open source is source code that is made freely available for possible modification and redistribution. Products include permission to use the source code, design documents, or content of the product. The open-source model is a decentralized sof ...
code provided a more convenient development platform. In late 1999/early 2000 it made its way to
Sistina Software Sistina Software was a US company that focused on storage solutions designed around a Linux platform. It originated in the University of Minnesota. Their three primary offerings were Global File System (GFS), logical volume management (LVM) and de ...
, where it lived for a time as an
open-source Open source is source code that is made freely available for possible modification and redistribution. Products include permission to use the source code, design documents, or content of the product. The open-source model is a decentralized sof ...
project. In 2001, Sistina made the choice to make GFS a proprietary product. Developers forked OpenGFS from the last public release of GFS and then further enhanced it to include updates allowing it to work with OpenDLM. But OpenGFS and OpenDLM became defunct, since
Red Hat Red Hat, Inc. is an American software company that provides open source software products to enterprises. Founded in 1993, Red Hat has its corporate headquarters in Raleigh, North Carolina, with other offices worldwide. Red Hat has become ass ...
purchased Sistina in December 2003 and released GFS and many cluster-infrastructure pieces under the
GPL The GNU General Public License (GNU GPL or simply GPL) is a series of widely used free software licenses that guarantee end users the four freedoms to run, study, share, and modify the software. The license was the first copyleft for general u ...
in late June 2004.
Red Hat Red Hat, Inc. is an American software company that provides open source software products to enterprises. Founded in 1993, Red Hat has its corporate headquarters in Raleigh, North Carolina, with other offices worldwide. Red Hat has become ass ...
subsequently financed further development geared towards bug-fixing and stabilization. A further development, GFS2 derives from GFS and was included along with its
distributed lock manager Operating systems use lock managers to organise and serialise the access to resources. A distributed lock manager (DLM) runs in every machine in a cluster, with an identical copy of a cluster-wide lock database. In this way a DLM provides software ...
(shared with GFS) in Linux 2.6.19. Red Hat Enterprise Linux 5.2 included GFS2 as a kernel module for evaluation purposes. With the 5.3 update, GFS2 became part of the kernel package. GFS2 forms part of the
Fedora A fedora () is a hat with a soft brim and indented crown.Kilgour, Ruth Edwards (1958). ''A Pageant of Hats Ancient and Modern''. R. M. McBride Company. It is typically creased lengthwise down the crown and "pinched" near the front on both sides ...
,
Red Hat Enterprise Linux Red Hat Enterprise Linux (RHEL) is a commercial open-source Linux distribution developed by Red Hat for the commercial market. Red Hat Enterprise Linux is released in server versions for x86-64, Power ISA, ARM64, and IBM Z and a desktop version ...
and associated
CentOS CentOS (, from Community Enterprise Operating System; also known as CentOS Linux) is a Linux distribution that provides a free and open-source community-supported computing platform, functionally compatible with its upstream source, Red Hat En ...
Linux distributions. Users can purchase commercial support to run GFS2 fully supported on top of
Red Hat Enterprise Linux Red Hat Enterprise Linux (RHEL) is a commercial open-source Linux distribution developed by Red Hat for the commercial market. Red Hat Enterprise Linux is released in server versions for x86-64, Power ISA, ARM64, and IBM Z and a desktop version ...
. As of Red Hat Enterprise Linux 8.3, GFS2 is supported in
cloud computing Cloud computing is the on-demand availability of computer system resources, especially data storage ( cloud storage) and computing power, without direct active management by the user. Large clouds often have functions distributed over mul ...
environments in which shared storage devices are available. The following list summarizes some version numbers and major features introduced: * v1.0 (1996)
SGI SGI may refer to: Companies *Saskatchewan Government Insurance *Scientific Games International, a gambling company *Silicon Graphics, Inc., a former manufacturer of high-performance computing products *Silicon Graphics International, formerly Rac ...
IRIX IRIX ( ) is a discontinued operating system developed by Silicon Graphics (SGI) to run on the company's proprietary MIPS workstations and servers. It is based on UNIX System V with BSD extensions. In IRIX, SGI originated the XFS file system and ...
only * v3.0 Linux port * v4 journaling * v5 Redundant Lock Manager * v6.1 (2005)
Distributed Lock Manager Operating systems use lock managers to organise and serialise the access to resources. A distributed lock manager (DLM) runs in every machine in a cluster, with an identical copy of a cluster-wide lock database. In this way a DLM provides software ...

Linux 2.6.19

GFS2 and DLM merged into Linux kernel
* Red Hat Enterprise Linux 5.3 releases the first fully supported GFS2


Hardware

The design of GFS and of GFS2 targets SAN-like environments. Although it is possible to use them as a single node filesystem, the full feature-set requires a SAN. This can take the form of
iSCSI Internet Small Computer Systems Interface or iSCSI ( ) is an Internet Protocol-based storage networking standard for linking data storage facilities. iSCSI provides block-level access to storage devices by carrying SCSI commands over a TCP/IP ...
,
FibreChannel Fibre Channel (FC) is a high-speed data transfer protocol providing in-order, lossless delivery of raw block data. Fibre Channel is primarily used to connect computer data storage to servers in storage area networks (SAN) in commercial data cen ...
, AoE, or any other device which can be presented under
Linux Linux ( or ) is a family of open-source Unix-like operating systems based on the Linux kernel, an operating system kernel first released on September 17, 1991, by Linus Torvalds. Linux is typically packaged as a Linux distribution, which ...
as a block device shared by a number of nodes, for example a
DRBD DRBD is a distributed replicated storage system for the Linux platform. It is implemented as a kernel driver, several userspace management applications, and some shell scripts. DRBD is traditionally used in high availability (HA) computer clust ...
device. The DLM requires an IP based network over which to communicate. This is normally just
Ethernet Ethernet () is a family of wired computer networking technologies commonly used in local area networks (LAN), metropolitan area networks (MAN) and wide area networks (WAN). It was commercially introduced in 1980 and first standardized in 198 ...
, but again, there are many other possible solutions. Depending upon the choice of SAN, it may be possible to combine this, but normal practice involves separate networks for the DLM and storage. The GFS requires a
fencing Fencing is a group of three related combat sports. The three disciplines in modern fencing are the foil, the épée, and the sabre (also ''saber''); winning points are made through the weapon's contact with an opponent. A fourth discipline, s ...
mechanism of some kind. This is a requirement of the cluster infrastructure, rather than GFS/GFS2 itself, but it is required for all multi-node clusters. The usual options include power switches and remote access controllers (e.g. DRAC, IPMI, or
ILO The International Labour Organization (ILO) is a United Nations agency whose mandate is to advance social and economic justice by setting international labour standards. Founded in October 1919 under the League of Nations, it is the first and ol ...
). Virtual and hypervisor-based fencing mechanisms can also be used.
Fencing Fencing is a group of three related combat sports. The three disciplines in modern fencing are the foil, the épée, and the sabre (also ''saber''); winning points are made through the weapon's contact with an opponent. A fourth discipline, s ...
is used to ensure that a node which the cluster believes to be failed cannot suddenly start working again while another node is recovering the journal for the failed node. It can also optionally restart the failed node automatically once the recovery is complete.


Differences from a local filesystem

Although the designers of GFS/GFS2 aimed to emulate a local filesystem closely, there are a number of differences to be aware of. Some of these are due to the existing filesystem interfaces not allowing the passing of information relating to the cluster. Some stem from the difficulty of implementing those features efficiently in a clustered manner. For example: * The flock() system call on GFS/GFS2 is not interruptible by
signals In signal processing, a signal is a function that conveys information about a phenomenon. Any quantity that can vary over space or time can be used as a signal to share messages between observers. The ''IEEE Transactions on Signal Processing'' ...
. * The fcntl() F_GETLK system call returns a PID of any blocking lock. Since this is a cluster filesystem, that PID might refer to a process on any of the nodes which have the filesystem mounted. Since the purpose of this interface is to allow a signal to be sent to the blocking process, this is no longer possible. * Leases are not supported with the lock_dlm (cluster) lock module, but they are supported when used as a local filesystem *
dnotify dnotify is a file system event monitor for the Linux kernel, one of the subfeatures of the fcntl call. It was introduced in the 2.4 kernel series. It has been obsoleted by inotify, but will be retained for compatibility reasons. Its function is es ...
will work on a "same node" basis, but its use with GFS/GFS2 is not recommended *
inotify inotify ( inode notify) is a Linux kernel subsystem created by John McCutchan, which monitors changes to the filesystem, and reports those changes to applications. It can be used to automatically update directory views, reload configuration files ...
will also work on a "same node" basis, and is also not recommended (but it may become supported in the future) * splice is supported on GFS2 only The other main difference, and one that is shared by all similar cluster filesystems, is that the cache control mechanism, known as glocks (pronounced Gee-locks) for GFS/GFS2, has an effect across the whole cluster. Each
inode The inode (index node) is a data structure in a Unix-style file system that describes a file-system object such as a file or a directory. Each inode stores the attributes and disk block locations of the object's data. File-system object attribute ...
on the filesystem has two glocks associated with it. One (called the iopen glock) keeps track of which processes have the inode open. The other (the inode glock) controls the cache relating to that inode. A glock has four states, UN (unlocked), SH (shared – a read lock), DF (deferred – a read lock incompatible with SH) and EX (exclusive). Each of the four modes maps directly to a DLM lock mode. When in EX mode, an inode is allowed to cache data and
metadata Metadata is "data that provides information about other data", but not the content of the data, such as the text of a message or the image itself. There are many distinct types of metadata, including: * Descriptive metadata – the descriptive ...
(which might be "dirty", i.e. waiting for write back to the filesystem). In SH mode, the inode can cache data and metadata, but it must not be dirty. In DF mode, the inode is allowed to cache metadata only, and again it must not be dirty. The DF mode is used only for direct I/O. In UN mode, the inode must not cache any metadata. In order that operations which change an inode's data or metadata do not interfere with each other, an EX lock is used. This means that certain operations, such as create/unlink of files from the ''same'' directory and writes to the ''same'' file should be, in general, restricted to one node in the cluster. Of course, doing these operations from multiple nodes will work as expected, but due to the requirement to flush caches frequently, it will not be very efficient. The single most frequently asked question about GFS/GFS2 performance is why the performance can be poor with email servers. The solution is to break up the mail spool into separate directories and to try to keep (so far as is possible) each node reading and writing to a private set of directories.


Journaling

GFS and GFS2 are both
journaled file system A journaling file system is a file system that keeps track of changes not yet committed to the file system's main part by recording the goal of such changes in a data structure known as a "journal (computing), journal", which is usually a circul ...
s; and GFS2 supports a similar set of journaling modes as
ext3 ext3, or third extended filesystem, is a journaled file system that is commonly used by the Linux kernel. It used to be the default file system for many popular Linux distributions. Stephen Tweedie first revealed that he was working on extend ...
. In ''data=writeback'' mode, only metadata is journaled. This is the only mode supported by GFS, however it is possible to turn on journaling on individual data-files, but only when they are of zero size. Journaled files in GFS have a number of restrictions placed upon them, such as no support for the
mmap In computing, mmap(2) is a POSIX-compliant Unix system call that maps files or devices into memory. It is a method of memory-mapped file I/O. It implements demand paging because file contents are not immediately read from disk and initially use no ...
or sendfile system calls, they also use a different on-disk format from regular files. There is also an "inherit-journal" attribute which when set on a directory causes all files (and sub-directories) created within that directory to have the journal (or inherit-journal, respectively) flag set. This can be used instead of the ''data=journal'' mount option which
ext3 ext3, or third extended filesystem, is a journaled file system that is commonly used by the Linux kernel. It used to be the default file system for many popular Linux distributions. Stephen Tweedie first revealed that he was working on extend ...
supports (and GFS/GFS2 does not). GFS2 also supports ''data=ordered'' mode which is similar to ''data=writeback'' except that dirty data is synced before each journal flush is completed. This ensures that blocks which have been added to an inode will have their content synced back to disk before the metadata is updated to record the new size and thus prevents uninitialised blocks appearing in a file under node failure conditions. The default journaling mode is ''data=ordered'', to match
ext3 ext3, or third extended filesystem, is a journaled file system that is commonly used by the Linux kernel. It used to be the default file system for many popular Linux distributions. Stephen Tweedie first revealed that he was working on extend ...
's default. , GFS2 does not yet support ''data=journal'' mode, but it does (unlike GFS) use the same on-disk format for both regular and journaled files, and it also supports the same journaled and inherit-journal attributes. GFS2 also relaxes the restrictions on when a file may have its journaled attribute changed to any time that the file is not open (also the same as
ext3 ext3, or third extended filesystem, is a journaled file system that is commonly used by the Linux kernel. It used to be the default file system for many popular Linux distributions. Stephen Tweedie first revealed that he was working on extend ...
). For performance reasons, each node in GFS and GFS2 has its own journal. In GFS the journals are disk extents, in GFS2 the journals are just regular files. The number of nodes which may mount the filesystem at any one time is limited by the number of available journals.


Features of GFS2 compared with GFS

GFS2 adds a number of new features which are not in GFS. Here is a summary of those features not already mentioned in the boxes to the right of this page: * The metadata filesystem (really a different root) – see Compatibility and the GFS2 meta filesystem below * GFS2 specific trace points have been available since kernel 2.6.32 * The XFS-style quota interface has been available in GFS2 since kernel 2.6.33 * Caching ACLs have been available in GFS2 since 2.6.33 * GFS2 supports the generation of "discard" requests for thin provisioning/SCSI TRIM requests * GFS2 supports I/O barriers (on by default, assuming underlying device supports it. Configurable from kernel 2.6.33 and up) * FIEMAP ioctl (to query mappings of inodes on disk) *
Splice (system call) is a Linux-specific system call that moves data between a file descriptor and a pipe without a round trip to user space. The related system call moves or copies data between a pipe and user space. Ideally, splice and vmsplice work by remapping ...
support * mmap/splice support for journaled files (enabled by using the same on disk format as for regular files) * Far fewer tunables (making set-up less complicated) * Ordered write mode (as per ext3, GFS only has writeback mode)


Compatibility and the GFS2 meta filesystem

GFS2 was designed so that upgrading from GFS would be a simple procedure. To this end, most of the on-disk structure has remained the same as GFS, including the
big-endian In computing, endianness, also known as byte sex, is the order or sequence of bytes of a word of digital data in computer memory. Endianness is primarily expressed as big-endian (BE) or little-endian (LE). A big-endian system stores the most sig ...
byte ordering. There are a few differences though: * GFS2 has a "meta filesystem" through which processes access system files * GFS2 uses the same on-disk format for journaled files as for regular files * GFS2 uses regular (system) files for journals, whereas GFS uses special extents * GFS2 has some other "" system files * The layout of the
inode The inode (index node) is a data structure in a Unix-style file system that describes a file-system object such as a file or a directory. Each inode stores the attributes and disk block locations of the object's data. File-system object attribute ...
is (very slightly) different * The layout of indirect blocks differs slightly The journaling systems of GFS and GFS2 are not compatible with each other. Upgrading is possible by means of a tool () which is run with the filesystem off-line to update the metadata. Some spare blocks in the GFS journals are used to create the (very small) files required by GFS2 during the update process. Most of the data remains in place. The GFS2 "meta filesystem" is not a filesystem in its own right, but an alternate
root In vascular plants, the roots are the organs of a plant that are modified to provide anchorage for the plant and take in water and nutrients into the plant body, which allows plants to grow taller and faster. They are most often below the sur ...
of the main filesystem. Although it behaves like a "normal" filesystem, its contents are the various system files used by GFS2, and normally users do not need to ever look at it. The GFS2 utilities
mount Mount is often used as part of the name of specific mountains, e.g. Mount Everest. Mount or Mounts may also refer to: Places * Mount, Cornwall, a village in Warleggan parish, England * Mount, Perranzabuloe, a hamlet in Perranzabuloe parish, C ...
and unmount the meta filesystem as required, behind the scenes.


See also

*
Comparison of file systems The following tables compare general and technical information for a number of file systems. General information Limits Metadata Features File capabilities Block capabilities Note that in addition to the below table, blo ...
*
GPFS GPFS (General Parallel File System, brand name IBM Spectrum Scale) is high-performance clustered file system software developed by IBM. It can be deployed in shared-disk or shared-nothing distributed parallel modes, or a combination of these. It ...
,
ZFS ZFS (previously: Zettabyte File System) is a file system with volume management capabilities. It began as part of the Sun Microsystems Solaris operating system in 2001. Large parts of Solaris – including ZFS – were published under an ope ...
,
VxFS The VERITAS File System (or VxFS; called JFS and OnlineJFS in HP-UX) is an extent-based file system. It was originally developed by VERITAS Software. Through an OEM agreement, VxFS is used as the primary filesystem of the HP-UX operating s ...
*
Lustre Lustre or Luster may refer to: Places * Luster, Norway, a municipality in Vestlandet, Norway ** Luster (village), a village in the municipality of Luster * Lustre, Montana, an unincorporated community in the United States Entertainment * '' ...
*
GlusterFS Gluster Inc. (formerly known as Z RESEARCH) was a software company that provided an open source platform for scale-out public and private cloud storage. The company was privately funded and headquartered in Sunnyvale, California, with an engineer ...
*
List of file systems The following lists identify, characterize, and link to more thorough information on Computer file systems. Many older operating systems support only their one "native" file system, which does not bear any name apart from the name of the operating ...
*
OCFS2 The Oracle Cluster File System (OCFS, in its second version OCFS2) is a shared disk file system developed by Oracle Corporation and released under the GNU General Public License. The first version of OCFS was developed with the main focus to accom ...
*
QFS QFS (Quick File System) is a filesystem from Oracle. It is tightly integrated with SAM, the Storage and Archive Manager, and hence is often referred to as SAM-QFS. SAM provides the functionality of a hierarchical storage manager. Features QFS ...
*
SAN file system The SAN File System (SFS) is a high-performance, clustered file system created by the company DataPlow. SFS enables fast access to shared files located on shared, storage area network (SAN)-attached storage devices. SFS utilizes the high-speed, ...
*
Fencing Fencing is a group of three related combat sports. The three disciplines in modern fencing are the foil, the épée, and the sabre (also ''saber''); winning points are made through the weapon's contact with an opponent. A fourth discipline, s ...
* Open-Sharedroot *
Ceph (software) Ceph (pronounced ) is an open-source software-defined storage platform that implements object storage on a single distributed computer cluster and provides 3-in-1 interfaces for object-, block- and file-level storage. Ceph aims primarily f ...


References


External links

* Red Ha
Red Hat Enterprise Linux 6 - Global File System 2
* Red Ha
Cluster Suite and GFS Documentation Page

GFS Project Page

OpenGFS Project Page
(obsolete)
The GFS2 development git tree

The GFS2 utilities development git tree
{{Filesystem Distributed file systems supported by the Linux kernel Red Hat software Shared disk file systems University of Minnesota software Virtualization software for Linux