Filesystem API

A file system API is an application programming interface through which a utility or user program requests services of a file system. An operating system may provide abstractions for accessing different file systems transparently. Some file system APIs may also include interfaces for maintenance operations, such as creating or initializing a file system, verifying the file system for integrity, and defragmentation. Each operating system includes the APIs needed for the file systems it supports. Microsoft Windows has file system APIs for NTFS and several FAT file systems. Linux systems can include APIs for ext2, ext3, ReiserFS, and Btrfs, to name a few.


History

Some early operating systems were capable of handling only tape and disk file systems. These provided the most basic of interfaces with:

* Write, read and position

More coordination, such as device allocation and deallocation, required the addition of:

* Open and close

As file systems provided more services, more interfaces were defined:

* Metadata management
* File system maintenance

As the number of file system types, hierarchy structures and supported media increased, additional features needed some specialized functions:

* Directory management
* Data structure management
* Record management
* Non-data operations

Multi-user systems required APIs for:

* Sharing
* Restricting access
* Encryption


API overviews


Write, read and position

Writing user data to a file system is provided for use directly by the user program or the run-time library. The run-time library for some programming languages may provide type conversion, formatting and blocking. Some file systems provide identification of records by key and may include re-writing an existing record. This operation is sometimes called PUT or PUTX (if the record exists).

Reading user data, sometimes called GET, may include a direction (forward or reverse) or, in the case of a keyed file system, a specific key. As with writing, run-time libraries may intercede for the user program.

Positioning includes adjusting the location of the next record. This may include skipping forward or reverse as well as positioning to the beginning or end of the file.
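
The three operations can be sketched with Python's built-in file objects. This is only an illustration under stated assumptions: the fixed 8-byte record size, the file name `records.dat` and the function name `write_read_position` are made up for the example and are not part of any particular file system API.

```python
import os
import tempfile

def write_read_position(path):
    """Write three fixed-size records, then reposition and read some back."""
    record_size = 8                              # assumed record length
    with open(path, "wb") as f:                  # write: records stored in order
        for name in (b"alpha", b"beta", b"gamma"):
            f.write(name.ljust(record_size))     # pad each record with spaces

    with open(path, "rb") as f:
        f.seek(2 * record_size, os.SEEK_SET)     # position: skip forward to the third record
        third = f.read(record_size).strip()
        f.seek(0, os.SEEK_SET)                   # position: back to the beginning
        first = f.read(record_size).strip()
        f.seek(0, os.SEEK_END)                   # position: end of the file
        size = f.tell()
    return first, third, size

with tempfile.TemporaryDirectory() as tmp:
    result = write_read_position(os.path.join(tmp, "records.dat"))
```

Here the run-time library (Python's buffered file object) sits between the program and the operating system, which is the kind of interceding the text describes.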


Open and close

The open API may be explicitly requested or implicitly invoked upon the issuance of the first operation by a process on an object. It may cause the mounting of removable media, establishing a connection to another host and validating the location and accessibility of the object. It updates system structures to indicate that the object is in use.

Usual requirements for requesting access to a file system object include:

1. The object which is to be accessed (file, directory, media and location)
2. The intended type of operations to be performed after the open (reads, updates, deletions)

Additional information may be necessary, for example:

* a password
* a declaration that other processes may access the same object while the opening process is using the object (sharing). This may depend on the intent of the other process. In contrast, a declaration that no other process may access the object regardless of the other processes' intent (exclusive use).

These are requested via a programming language library, which may provide coordination among modules in the process in addition to forwarding the request to the file system.

It must be expected that something may go wrong during the processing of the open:

1. The object or intent may be improperly specified (the name may include an unacceptable character or the intent is unrecognized).
2. The process may be prohibited from accessing the object (it may be only accessible by a group or specific user).
3. The file system may be unable to create or update structures required to coordinate activities among users.
4. In the case of a new (or replacement) object, there may not be sufficient capacity on the media.

Depending on the programming language, additional specifications in the open may establish the modules to handle these conditions. Some libraries specify a library module to the file system, permitting analysis should the opening program be unable to perform any meaningful action as a result of a failure. For example, if the failure is on the attempt to open the necessary input file, the only action may be to report the failure and abort the program. Some languages simply return a code indicating the type of failure, which must always be checked by the program, which decides what to report and whether it can continue.
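
As a rough sketch of an open that states its intent and classifies failures, the following uses Python's `os.open`. The helper `try_open` and its message strings are hypothetical; only the `os` flags and the exception classes come from the standard library. Python reports failures as exceptions rather than return codes, so the helper converts them into a result the caller can inspect.

```python
import errno
import os
import tempfile

def try_open(path, flags):
    """Attempt an open and classify the outcome instead of raising."""
    try:
        fd = os.open(path, flags, 0o600)
    except FileNotFoundError:
        return "object does not exist"
    except FileExistsError:
        return "object already exists"
    except PermissionError:
        return "access prohibited"
    except OSError as exc:
        if exc.errno == errno.ENOSPC:
            return "insufficient capacity on the media"
        return f"open failed: {exc.strerror}"
    os.close(fd)                                  # release the object again
    return "ok"

new_object = os.O_WRONLY | os.O_CREAT | os.O_EXCL  # intent: create a new object for writing

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "input.txt")
    missing = try_open(path, os.O_RDONLY)         # intent: read; nothing is there yet
    created = try_open(path, new_object)          # creates the object
    clash = try_open(path, new_object)            # fails because it now exists
```

Note that `O_EXCL` only makes creation fail when the object already exists; it is not a declaration of exclusive use in the sharing sense described above.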
Close may cause dismounting or ejecting removable media and updating library and file system structures to indicate that the object is no longer in use. The minimal specification to the close references the object. Additionally, some file systems allow specifying a disposition of the object, which may indicate that the object is to be discarded and no longer be part of the file system.

As with the open, it must be expected that something may go wrong:

1. The specification of the object may be incorrect.
2. There may not be sufficient capacity on the media to save any data being buffered or to output a structure indicating that the object was successfully updated.
3. A device error may occur on the media where the object is stored while writing buffered data, writing the completion structure or updating metadata related to the object (for example, last access time).
4. A specification to release the object may be inconsistent with other processes still using the object.

Considerations for handling a failure are similar to those of the open.
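
A minimal sketch of a close that accounts for buffered data, and of a disposition that discards the object at close, might look like this in Python. The function name `write_and_close` and its status strings are assumptions made for the example; `flush`, `os.fsync` and `tempfile.NamedTemporaryFile` are standard library calls.

```python
import os
import tempfile

def write_and_close(path, data):
    """Write data, force it to the medium, and report a failure at either step."""
    f = open(path, "wb")
    status = "closed"
    try:
        f.write(data)             # may only reach a library buffer at this point
        f.flush()                 # push library buffers to the operating system
        os.fsync(f.fileno())      # ask the operating system to write to the medium
    except OSError as exc:
        status = f"write failed: {exc.strerror}"
    try:
        f.close()                 # release the object
    except OSError as exc:
        status = f"close failed: {exc.strerror}"
    return status

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "out.dat")
    status = write_and_close(path, b"payload")
    size_after_close = os.path.getsize(path)

    # Disposition: this scratch file is removed from the file system when closed.
    scratch = tempfile.NamedTemporaryFile(dir=tmp)
    scratch_path = scratch.name
    scratch.close()
    scratch_gone = not os.path.exists(scratch_path)
```

Flushing before the close separates "the data could not be saved" from "the object could not be released", which corresponds to the second and third failure cases listed above.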


Metadata management

Information about the data in a file is called metadata. Some of the metadata is maintained by the file system, for example the last-modification date (and various other dates depending on the file system), the location of the beginning of the file, the size of the file, and whether the file system backup utility has saved the current version of the file. These items cannot usually be altered by a user program.

Additional metadata supported by some file systems may include the owner of the file, the group to which the file belongs, permissions and/or access control (i.e., what access and updates various users or groups may perform), and whether the file is normally visible when the directory is listed. These items are usually modifiable by file system utilities, which may be executed by the owner.

Some applications store more metadata. For images, the metadata may include the camera model and settings used to take the photo. For audio files, the metadata may include the album, the artist who made the recording, and comments about the recording which may be specific to a particular copy of the file (i.e., different copies of the same recording may have different comments as updated by the owner of the file). Documents may include items like checked-by, approved-by, etc.
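
As an illustration, Python's `os.stat` exposes metadata maintained by the file system, and `os.chmod` changes permissions the owner is allowed to modify. The file name `report.txt` is an assumption for the example, and which fields are meaningful varies by platform and file system.

```python
import os
import stat
import tempfile
import time

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "report.txt")
    with open(path, "w") as f:
        f.write("hello")

    info = os.stat(path)                     # metadata kept by the file system
    size = info.st_size                      # size of the file in bytes
    modified = time.ctime(info.st_mtime)     # last-modification date as text

    os.chmod(path, stat.S_IRUSR)             # owner restricts the file to read-only
    read_only = not (os.stat(path).st_mode & stat.S_IWUSR)

    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)  # restore write access before cleanup
```

Application-level metadata such as camera settings or album names is stored inside the file contents and is not visible through these calls.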


Directory management

Renaming a file, moving a file (or a subdirectory) from one directory to another and deleting a file are examples of the operations provided by the file system for the management of directories. Metadata operations such as permitting or restricting access to a directory by various users or groups of users are usually included.
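
These directory operations map closely onto a few standard calls in Python's `os` module. The directory and file names below (`inbox`, `archive`, `draft.txt`) are invented for the sketch.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    inbox = os.path.join(root, "inbox")
    archive = os.path.join(root, "archive")
    os.mkdir(inbox)
    os.mkdir(archive)

    draft = os.path.join(inbox, "draft.txt")
    with open(draft, "w") as f:
        f.write("notes")

    final = os.path.join(inbox, "final.txt")
    os.rename(draft, final)                   # rename within the same directory
    moved = os.path.join(archive, "final.txt")
    os.rename(final, moved)                   # move to another directory
    after_move = (sorted(os.listdir(inbox)), sorted(os.listdir(archive)))

    os.remove(moved)                          # delete the file
    after_delete = sorted(os.listdir(archive))
```

Both the rename and the move use the same call here because the two directories are on the same file system; moving across file systems generally requires copying the data.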


Filesystem maintenance

As a filesystem is used, directories, files and records may be added, deleted or modified. This usually causes inefficiencies in the underlying data structures, such as logically sequential blocks distributed across the media in a way that causes excessive repositioning, or partially used or even empty blocks included in linked structures. Incomplete structures or other inconsistencies may be caused by device or media errors, inadequate time between detection of impending loss of power and actual power loss, improper system shutdown or media removal, and on very rare occasions file system coding errors.

Specialized routines in the file system are included to optimize or repair these structures. They are not usually invoked by the user directly but triggered within the file system itself. Internal counters, such as the number of levels of structures or the number of inserted objects, may be compared against thresholds. These routines may cause user access to a specific structure to be suspended (usually to the displeasure of the user or users affected), may be started as low-priority asynchronous tasks, or may be deferred to a time of low user activity. Sometimes these routines are invoked or scheduled by the system manager, as in the case of defragmentation.
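
The threshold idea can be shown with a toy model in which a file is just a list of block numbers. Everything here is hypothetical: the function names, the threshold of three fragments, and the assumption that free space starts at a known block. Real file systems track far more state.

```python
def count_fragments(blocks):
    """Count runs of physically contiguous blocks in a file's block list."""
    if not blocks:
        return 0
    fragments = 1
    for previous, current in zip(blocks, blocks[1:]):
        if current != previous + 1:      # a gap forces the device to reposition
            fragments += 1
    return fragments

def needs_maintenance(blocks, max_fragments=3):
    """Compare the internal counter against an assumed threshold."""
    return count_fragments(blocks) > max_fragments

def defragment(blocks, first_free):
    """Return a contiguous layout of the same length starting at first_free."""
    return list(range(first_free, first_free + len(blocks)))

layout = [10, 11, 40, 41, 7, 8, 90]      # logically sequential, physically scattered
fragments_before = count_fragments(layout)
triggered = needs_maintenance(layout)
compacted = defragment(layout, first_free=100)
fragments_after = count_fragments(compacted)
```

In this model the check could run after each update, with the actual reorganization deferred to a period of low activity as the text describes.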


Kernel-level API

The API is "kernel-level" when the kernel not only provides the interfaces for filesystem developers but is also the space in which the filesystem code resides. It differs from the old scheme in that the kernel uses its own facilities to talk with the filesystem driver and vice versa, whereas in the old scheme the kernel handled the filesystem layout and the filesystem directly accessed the hardware. It is not the cleanest scheme, but it avoids the major rewrites that the old scheme required. With modular kernels, it allows filesystems to be added like any other kernel module, even third-party ones. With non-modular kernels, however, it requires the kernel to be recompiled with the new filesystem code (and in closed-source kernels, this makes third-party filesystems impossible). Unixes and Unix-like systems such as Linux have used this modular scheme.

There is a variation of this scheme used in MS-DOS (DOS 4.0 onward) and compatibles to support CD-ROM and network file systems. Instead of adding code to the kernel, as in the old scheme, or using kernel facilities, as in the kernel-based scheme, it traps all calls to a file and determines whether each should be redirected to the kernel's equivalent function or handled by the specific filesystem driver; the filesystem driver then "directly" accesses the disk contents using low-level BIOS functions.
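
The modular idea, where filesystems register with the kernel and file calls are routed to whichever driver owns the path, can be modelled in a few lines. This is a toy model in ordinary Python, not kernel code: the classes `Kernel` and `MemoryFS`, the method names and the mount points are all invented for illustration.

```python
class MemoryFS:
    """Toy filesystem driver that serves files from a dictionary."""
    def __init__(self, files):
        self._files = files

    def read(self, relative_path):
        return self._files[relative_path]

class Kernel:
    """Toy stand-in for a modular kernel with a table of filesystem drivers."""
    def __init__(self):
        self._drivers = {}
        self._mounts = {}

    def register_filesystem(self, name, driver):
        self._drivers[name] = driver          # comparable to loading a module

    def mount(self, prefix, fs_name):
        self._mounts[prefix] = self._drivers[fs_name]

    def read(self, path):
        # Route the call to the driver with the longest matching mount prefix.
        for prefix in sorted(self._mounts, key=len, reverse=True):
            if path.startswith(prefix):
                return self._mounts[prefix].read(path[len(prefix):])
        raise FileNotFoundError(path)

kernel = Kernel()
kernel.register_filesystem("memfs", MemoryFS({"motd": b"welcome"}))
kernel.register_filesystem("cdfs", MemoryFS({"readme": b"disc contents"}))
kernel.mount("/", "memfs")
kernel.mount("/cdrom/", "cdfs")
```

The routing step in `Kernel.read` also loosely resembles the trapping described for MS-DOS, where each file call is inspected and handed either to the built-in code or to a specific driver.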


Driver-based API

The API is "driver-based" when the kernel provides facilities but the file system code resides totally external to the kernel (not even as a module of a modular kernel). It is a cleaner scheme, as the filesystem code is totally independent; it allows filesystems to be created for closed-source kernels and filesystems to be added to or removed from the system online. Examples of this scheme are the respective IFSs of Windows NT and OS/2.


Mixed kernel-driver-based API

In this API all filesystems are in the kernel, as in kernel-based APIs, but calls to them are automatically trapped by the OS through another API that is driver-based. This scheme was used in Windows 3.1 to provide a cached FAT filesystem driver in 32-bit protected mode (VFAT) that bypassed the DOS FAT driver in the kernel (MSDOS.SYS) completely. It was later used in the Windows 9x series (95, 98 and Me) for VFAT, the ISO 9660 filesystem driver (along with Joliet), network shares, and third-party filesystem drivers, as well as for adding the LFN API to the original DOS APIs: IFS drivers can not only intercept the already existing DOS file APIs but also add new ones from within the 32-bit protected mode executable. However, that API was not completely documented, and third parties found themselves in a "make-it-by-yourself" scenario even worse than with kernel-based APIs.


User space API

The API is in the user space when the filesystem does not directly use kernel facilities but accesses disks using high-level operating system functions and provides functions in a library that a series of utilities use to access the filesystem. This is useful for handling disk images. The advantage is that a filesystem can be made portable between operating systems, as the high-level operating system functions it uses can be as common as ANSI C; the disadvantage is that the API is unique to each application that implements one. Examples of this scheme are the hfsutils and the adflib.
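
To make the idea concrete, the sketch below reads files out of a disk image using nothing but ordinary seek and read calls. The image layout is invented for this example (a 4-byte entry count, then fixed-size entries holding name, offset and length) and has no relation to the formats handled by hfsutils or adflib; `build_image` and `read_file` are hypothetical names.

```python
import io
import struct

ENTRY = struct.Struct("<16sII")   # assumed layout: name, data offset, data length

def build_image(files):
    """Pack a dict of name -> bytes into the toy image format."""
    header_size = 4 + ENTRY.size * len(files)
    table, data, offset = b"", b"", header_size
    for name, content in files.items():
        table += ENTRY.pack(name.encode("ascii"), offset, len(content))
        data += content
        offset += len(content)
    return struct.pack("<I", len(files)) + table + data

def read_file(image, wanted):
    """Find a file in the image using only seek and read on a file object."""
    image.seek(0)
    (count,) = struct.unpack("<I", image.read(4))
    for _ in range(count):
        raw_name, offset, length = ENTRY.unpack(image.read(ENTRY.size))
        if raw_name.rstrip(b"\0").decode("ascii") == wanted:
            image.seek(offset)
            return image.read(length)
    raise FileNotFoundError(wanted)

# An in-memory image; a file opened with open(path, "rb") would work the same way.
image = io.BytesIO(build_image({"a.txt": b"first", "b.txt": b"second"}))
```

Because the library depends only on a seekable byte stream, the same code runs on any operating system, while its function names remain specific to this one library, which is the trade-off the text points out.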


Interoperability between file system APIs

As all filesystems (at least the disk ones) need equivalent functions provided by the kernel, it is possible to easily port filesystem code from one API to another, even if they are of different types. For example, the ext2 driver for OS/2 is simply a wrapper from Linux's VFS to OS/2's IFS around Linux's kernel-based ext2 code, and the HFS driver for OS/2 is a port of the hfsutils to OS/2's IFS. There also exists a project that uses a Windows NT IFS driver for making NTFS work under Linux.
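
The wrapper approach can be illustrated with a small adapter that exposes filesystem code written against one API through a differently shaped one. Both interfaces here are made up: `SourceFS` offers whole-file reads by name, and `HandleAdapter` presents it as a handle-based open, read and close API. Neither resembles the real VFS or IFS interfaces.

```python
class SourceFS:
    """Toy filesystem written against one API: whole-file reads by name."""
    def __init__(self, files):
        self._files = files

    def read_file(self, name):
        return self._files[name]

class HandleAdapter:
    """Exposes a SourceFS through a different, handle-based API."""
    def __init__(self, fs):
        self._fs = fs
        self._open = {}
        self._next_handle = 1

    def open(self, name):
        handle = self._next_handle
        self._next_handle += 1
        self._open[handle] = [self._fs.read_file(name), 0]   # data, position
        return handle

    def read(self, handle, size):
        data, position = self._open[handle]
        chunk = data[position:position + size]
        self._open[handle][1] = position + len(chunk)        # advance the position
        return chunk

    def close(self, handle):
        del self._open[handle]

adapter = HandleAdapter(SourceFS({"notes": b"hello world"}))
h = adapter.open("notes")
first = adapter.read(h, 5)
rest = adapter.read(h, 100)
adapter.close(h)
```

The filesystem code itself is untouched; only the thin translation layer knows about both APIs.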


See also

* Comparison of file systems
* File system
* Filename extension
* Filing Open Service Interface Definition (OSID)
* Installable File System (IFS)
* List of file systems
* Virtual file system


References


Sources

* O'Reilly - Windows NT File System Internals, A Developer's Guide - By Rajeev Nagar
* Microsoft Press - Inside Windows NT File System - By Helen Custer
* Wiley - UNIX Filesystems: Evolution, Design, and Implementation - By Steve D. Pate
* Microsoft Press - Inside Windows NT - By Helen Custer


External links


* Filesystem Specifications and Technical Whitepapers
* Microsoft's IFSKit
* hfsutils
* adflib
* A FileSystem Abstraction System for Go