I-node
   HOME

TheInfoList



OR:

An inode (index node) is a
data structure In computer science, a data structure is a data organization and storage format that is usually chosen for Efficiency, efficient Data access, access to data. More precisely, a data structure is a collection of data values, the relationships amo ...
in a Unix-style file system that describes a file-system object such as a file or a directory. Each inode stores the attributes and disk block locations of the object's data. File-system object attributes may include
metadata Metadata (or metainformation) is "data that provides information about other data", but not the content of the data itself, such as the text of a message or the image itself. There are many distinct types of metadata, including: * Descriptive ...
(times of last change, access, modification), as well as owner and permission data. A directory is a list of inodes with their assigned names. The list includes an entry for itself, its parent, and each of its children.


Etymology

There has been uncertainty on the
Linux kernel mailing list The Linux kernel mailing list (LKML) is the main electronic mailing list for Linux kernel development, where the majority of the announcements, discussions, debates, and flame wars over the kernel take place. Many other mailing lists exist to d ...
about the reason for the "i" in "inode". In 2002, the question was brought to Unix pioneer
Dennis Ritchie Dennis MacAlistair Ritchie (September 9, 1941 – October 12, 2011) was an American computer scientist. He created the C programming language and the Unix operating system and B language with long-time colleague Ken Thompson. Ritchie and Thomp ...
, who replied: A 1978 paper by Ritchie and
Ken Thompson Kenneth Lane Thompson (born February 4, 1943) is an American pioneer of computer science. Thompson worked at Bell Labs for most of his career where he designed and implemented the original Unix operating system. He also invented the B (programmi ...
bolsters the notion of "index" being the etymological origin of inodes. They wrote: Additionally, Maurice J. Bach wrote that the word ''inode'' "is a contraction of the term index node and is commonly used in literature on the UNIX system".


Details

A file system relies on data structures ''about'' the files, as opposed to the contents of that file. The former are called ''
metadata Metadata (or metainformation) is "data that provides information about other data", but not the content of the data itself, such as the text of a message or the image itself. There are many distinct types of metadata, including: * Descriptive ...
''—data that describes data. Each file is associated with an ''inode'', which is identified by an integer, often referred to as an ''i-number'' or ''inode number''. Inodes store information about files and directories (folders), such as file ownership, access mode (read, write, execute permissions), and file type. The data may be called stat data, in reference to the stat
system call In computing, a system call (syscall) is the programmatic way in which a computer program requests a service from the operating system on which it is executed. This may include hardware-related services (for example, accessing a hard disk drive ...
that provides the data to programs. The inode number indexes a table of inodes on the file system. From the inode number, the kernel's file system driver can access the inode contents, including the location of the file, thereby allowing access to the file. A file's inode number can be found using the ls -i command. The ls -i command prints the inode number in the first column of the report. On many older file systems, inodes are stored in one or more fixed-size areas that are set up at file system creation time, so the maximum number of inodes is fixed at file system creation, limiting the maximum number of files the file system can hold. A typical allocation heuristic for inodes in a file system is one inode for every 2K bytes contained in the filesystem. Some Unix-style file systems such as JFS,
XFS XFS is a high-performance 64-bit journaling file system created by Silicon Graphics, Inc (SGI) in 1993. It was the default file system in SGI's IRIX operating system starting with its version 5.3. XFS was ported to the Linux kernel in 2001; a ...
,
ZFS ZFS (previously Zettabyte File System) is a file system with Volume manager, volume management capabilities. It began as part of the Sun Microsystems Solaris (operating system), Solaris operating system in 2001. Large parts of Solaris, includin ...
,
OpenZFS OpenZFS is an open-source implementation of the ZFS file system and volume manager initially developed by Sun Microsystems for the Solaris operating system, and is now maintained by the OpenZFS Project. Similar to the original ZFS, the impleme ...
,
ReiserFS ReiserFS is a general-purpose, journaling file system initially designed and implemented by a team at Namesys led by Hans Reiser and licensed under GPLv2. Introduced in version 2.4.1 of the Linux kernel, it was the first journaling file syst ...
,
btrfs Btrfs (pronounced as "better F S", "butter F S", "b-tree F S", or "B.T.R.F.S.") is a computer storage format that combines a file system based on the copy-on-write (COW) principle with a logical volume manager (distinct from Linux's LVM), d ...
, and
APFS Apple File System (APFS) is a proprietary file system developed and deployed by Apple Inc. for macOS Sierra (10.12.4) and later, iOS 10.3, tvOS 10.2, watchOS 3.2, and all versions of iPadOS. It aims to fix core problems of HFS+ (also ca ...
omit a fixed-size inode table, but must store equivalent data in order to provide equivalent capabilities. Common alternatives to the fixed-size table include
B-tree In computer science, a B-tree is a self-balancing tree data structure that maintains sorted data and allows searches, sequential access, insertions, and deletions in logarithmic time. The B-tree generalizes the binary search tree, allowing fo ...
s and the derived
B+ tree A B+ tree is an m-ary tree with a variable but often large number of children per node. A B+ tree consists of a root, internal nodes and leaves. The root may be either a leaf or a node with two or more children. A B+ tree can be viewed as a B ...
s. File names and directory implications: * Inodes do not contain their
hard link In computing, a hard link is a directory entry (in a Directory (computing), directory-based file system) that associates a name with a Computer file, file. Thus, each file must have at least one hard link. Creating additional hard links for a fil ...
names, only other file metadata. * Unix directories are lists of association structures, each of which contains one filename and one inode number. * The file system driver must search a directory for a particular filename and then convert the filename to the correct corresponding inode number. The operating system kernel's in-memory representation of this data is called struct inode in
Linux Linux ( ) is a family of open source Unix-like operating systems based on the Linux kernel, an kernel (operating system), operating system kernel first released on September 17, 1991, by Linus Torvalds. Linux is typically package manager, pac ...
. Systems derived from
BSD The Berkeley Software Distribution (BSD), also known as Berkeley Unix or BSD Unix, is a discontinued Unix operating system developed and distributed by the Computer Systems Research Group (CSRG) at the University of California, Berkeley, beginni ...
use the term vnode (the "v" refers to the kernel's
virtual file system A virtual file system (VFS) or virtual filesystem switch is an abstract layer on top of a more concrete file system. The purpose of a VFS is to allow client applications to access different types of concrete file systems in a uniform way. A VFS ...
layer).


POSIX inode description

The
POSIX The Portable Operating System Interface (POSIX; ) is a family of standards specified by the IEEE Computer Society for maintaining compatibility between operating systems. POSIX defines application programming interfaces (APIs), along with comm ...
standard mandates file-system behavior that is strongly influenced by traditional
UNIX Unix (, ; trademarked as UNIX) is a family of multitasking, multi-user computer operating systems that derive from the original AT&T Unix, whose development started in 1969 at the Bell Labs research center by Ken Thompson, Dennis Ritchie, a ...
file systems. An inode is denoted by the phrase "file serial number", defined as a ''per-file system'' unique identifier for a file. That file serial number, together with the device ID of the device containing the file, uniquely identify the file within the whole system. Within a POSIX system, a file has the following attributes which may be retrieved by the stat system call: * Device ID (this identifies the device containing the file; that is, the scope of uniqueness of the serial number). * File serial numbers. * The file ''mode'' which determines the file type and how the file's owner, its group, and how others can access the file. * A link count telling how many
hard link In computing, a hard link is a directory entry (in a Directory (computing), directory-based file system) that associates a name with a Computer file, file. Thus, each file must have at least one hard link. Creating additional hard links for a fil ...
s point to the inode. * The
User ID Unix-like operating systems identify a user by a value called a user identifier, often abbreviated to user ID or UID. The UID, along with the group identifier (GID) and other access control criteria, is used to determine which system resources a us ...
of the file's owner. * The
Group ID In Unix-like systems, multiple users can be put into '' groups''. POSIX and conventional Unix file system permissions are organized into three classes, ''user'', ''group'', and ''others''. The use of groups allows additional abilities to be dele ...
of the file. * The device ID of the file if it is a
device file In Unix-like operating systems, a device file, device node, or special file is an interface to a device driver that appears in a file system as if it were an ordinary file. There are also special files in DOS, OS/2, and Windows. These s ...
. * The size of the file in
byte The byte is a unit of digital information that most commonly consists of eight bits. Historically, the byte was the number of bits used to encode a single character of text in a computer and for this reason it is the smallest addressable un ...
s. *
Timestamp A timestamp is a sequence of characters or encoded information identifying when a certain event occurred, usually giving date and time of day, sometimes accurate to a small fraction of a second. Timestamps do not have to be based on some absolu ...
s telling when the inode itself was last modified (, ''inode change time''), the file content last modified (, ''modification time''), and last accessed (, ''access time''). * The preferred I/O block size. * The number of blocks allocated to this file.


Implications

Filesystems designed with inodes will have the following administrative characteristics:


Multi-named files and hard links

Files can have multiple names. If multiple names hard link to the same inode then the names are equivalent; i.e., the first to be created has no special status. This is unlike
symbolic link In computing, a symbolic link (also symlink or soft link) is a file whose purpose is to point to a file or directory (called the "target") by specifying a path thereto. Symbolic links are supported by POSIX and by most Unix-like operating syste ...
s, which depend on the original name, not the inode (number).


inode persistence and unlinked files

An inode may have no links. An inode without links represents a file with no remaining directory entries or paths leading to it in the filesystem. A file that has been deleted or lacks directory entries pointing to it is termed an 'unlinked' file. Such files are removed from the filesystem, freeing the occupied disk space for reuse. An inode without links remains in the filesystem until the resources (disk space and blocks) freed by the unlinked file are deallocated or the file system is modified. Although an unlinked file becomes invisible in the filesystem, its deletion is deferred until all processes with access to the file have finished using it, including executable files which are implicitly held open by the processes executing them.


inode number conversion and file directory path retrieval

It is typically not possible to map from an open file to the filename that was used to open it. When a program opens a file, the operating system converts the filename to an inode number and then discards the filename. As a result, functions like and which retrieve the current
working directory In computing, the working directory of a process is a directory of a hierarchical file system, if any, dynamically associated with the process. It is sometimes called the current working directory (CWD), e.g. the BSD getcwd function, or just c ...
of the process, cannot directly access the filename. Beginning with the current directory, these functions search up to its
parent directory In computing, a directory is a file system cataloging structure that contains references to other computer files, and possibly other directories. On many computers, directories are known as folders or drawers, analogous to a workbench or the t ...
, then to the parent's parent, and so on, until reaching the
root directory In a Computing, computer file system, and primarily used in the Unix and Unix-like operating systems, the root directory is the first or top-most Directory (computing), directory in a hierarchy. It can be likened to the trunk of a Tree (data st ...
. At each level, the function looks for a directory entry whose inode matches that of the directory it just moved up from. Because the child directory's inode still exists as an entry in its
parent directory In computing, a directory is a file system cataloging structure that contains references to other computer files, and possibly other directories. On many computers, directories are known as folders or drawers, analogous to a workbench or the t ...
, it allows the function to reconstruct the
absolute path A path (or filepath, file path, pathname, or similar) is a text string that uniquely specifies an item in a hierarchical file system. Generally, a path is composed of directory names, special directory specifiers and optionally a filename, separ ...
of the current
working directory In computing, the working directory of a process is a directory of a hierarchical file system, if any, dynamically associated with the process. It is sometimes called the current working directory (CWD), e.g. the BSD getcwd function, or just c ...
. Some operating systems maintain extra information to make this operation run faster. For example, in the
Linux Linux ( ) is a family of open source Unix-like operating systems based on the Linux kernel, an kernel (operating system), operating system kernel first released on September 17, 1991, by Linus Torvalds. Linux is typically package manager, pac ...
VFS, directory entry cache, also known as dentry or dcache, are cache entries used by the
kernel Kernel may refer to: Computing * Kernel (operating system), the central component of most operating systems * Kernel (image processing), a matrix used for image convolution * Compute kernel, in GPGPU programming * Kernel method, in machine learnin ...
to speed up filesystem operations by storing information about directory links in
RAM Ram, ram, or RAM most commonly refers to: * A male sheep * Random-access memory, computer memory * Ram Trucks, US, since 2009 ** List of vehicles named Dodge Ram, trucks and vans ** Ram Pickup, produced by Ram Trucks Ram, ram, or RAM may also ref ...
.


Historical possibility of directory hard linking

Historically, it was possible to
hard link In computing, a hard link is a directory entry (in a Directory (computing), directory-based file system) that associates a name with a Computer file, file. Thus, each file must have at least one hard link. Creating additional hard links for a fil ...
directories. This made the directory structure an arbitrary directed graph contrary to a
directed acyclic graph In mathematics, particularly graph theory, and computer science, a directed acyclic graph (DAG) is a directed graph with no directed cycles. That is, it consists of vertices and edges (also called ''arcs''), with each edge directed from one ...
. It was even possible for a directory to be its own parent. Modern systems generally prohibit this confusing state, except that the parent of ''root'' is still defined as root. The most notable exception to this prohibition is found in
Mac OS X macOS, previously OS X and originally Mac OS X, is a Unix, Unix-based operating system developed and marketed by Apple Inc., Apple since 2001. It is the current operating system for Apple's Mac (computer), Mac computers. With ...
(versions 10.5 and higher) which allows hard links of directories to be created on
HFS+ HFS Plus or HFS+ (also known as Mac OS Extended or HFS Extended) is a journaling file system developed by Apple Inc. It replaced the Hierarchical File System (HFS) as the primary file system of Apple computers with the 1998 release of Mac OS 8. ...
file systems by the superuser.


inode number stability and non-Unix file systems

When a file is relocated to a different directory on the same file system, or when a disk
defragmentation In the maintenance of file systems, defragmentation is a process that reduces the degree of fragmentation. It does this by physically organizing the contents of the mass storage device used to store files into the smallest number of contiguous ...
alters its physical location, the file's inode number remains unchanged. This unique characteristic permits the file to be moved or renamed even during read or write operations, thereby ensuring continuous access without disruptions. This feature—having a file's metadata and
data block In computing (specifically data transmission and data storage), a block, sometimes called a physical record, is a sequence of bytes or bits, usually containing some whole number of records, having a fixed length; a ''block size''. Data thus s ...
locations persist in a central
data structure In computer science, a data structure is a data organization and storage format that is usually chosen for Efficiency, efficient Data access, access to data. More precisely, a data structure is a collection of data values, the relationships amo ...
, irrespective of file renaming or moving—cannot be fully replicated in many non-Unix file systems like
FAT In nutrition science, nutrition, biology, and chemistry, fat usually means any ester of fatty acids, or a mixture of such chemical compound, compounds, most commonly those that occur in living beings or in food. The term often refers specif ...
and its derivatives, as they lack a mechanism to maintain this invariant property when both the file's directory entry and its data are simultaneously relocated. In these file systems, moving or renaming a file might lead to more significant changes in the data structure representing the file, and the system does not keep a separate, central record of the file's
data block In computing (specifically data transmission and data storage), a block, sometimes called a physical record, is a sequence of bytes or bits, usually containing some whole number of records, having a fixed length; a ''block size''. Data thus s ...
locations and
metadata Metadata (or metainformation) is "data that provides information about other data", but not the content of the data itself, such as the text of a message or the image itself. There are many distinct types of metadata, including: * Descriptive ...
as inodes do in
Unix-like A Unix-like (sometimes referred to as UN*X, *nix or *NIX) operating system is one that behaves in a manner similar to a Unix system, although not necessarily conforming to or being certified to any version of the Single UNIX Specification. A Uni ...
systems.


Simplified library installation with inode file systems

inode file systems allow a running process to continue accessing a library file even as another process is replacing that same file. This operation should be performed atomically, meaning it should appear as a single operation that is either entirely completed or not done at all, with no intermediate state visible to other processes. During the replacement, a new inode is created for the new library file, establishing an entirely new mapping. Subsequently, future access requests for that library will retrieve the newly installed version. When the operating system is replacing the file (and creating a new inode), it places a lock on the inode and possibly the containing directory. This prevents other processes from reading or writing to the file (inode) during the update operation, thereby avoiding data inconsistency or corruption. Once the update operation is complete, the lock is released. Any subsequent access to the file (via the inode) by any processes will now point to the new version of the library. Thus, making it possible to perform updates even when the library is in use by another process. One significant advantage of this mechanism is that it eliminates the need for a system reboot to replace libraries currently in use. Consequently, systems can update or upgrade
software libraries In computing, a library is a collection of resources that can be leveraged during software development to implement a computer program. Commonly, a library consists of executable code such as compiled functions and classes, or a library can ...
seamlessly without interrupting running processes or operations.


Potential for inode exhaustion and solutions

When a file system is created, some file systems allocate a fixed number of inodes. This means that it is possible to run out of inodes on a file system, even if there is free space remaining in the file system. This situation often arises in use cases where there are many small files, such as on a server storing email messages, because each file, no matter how small, requires its own inode. Other file systems avoid this limitation by using dynamic inode allocation. Dynamic inode allocation allows a file system to create more inodes as needed instead of relying on a fixed number created at the time of file system creation. This can "grow" the file system by increasing the number of inodes available for new files and directories, thus avoiding the problem of running out of inodes.


Inlining

It can make sense to store very small files in the inode itself to save both space (no data block needed) and lookup time (no further disk access needed). This file system feature is called inlining. The strict separation of inode and file data thus can no longer be assumed when using modern file systems. If the data of a file fits in the space allocated for pointers to the data, this space can conveniently be used. For example,
ext2 ext2, or second extended file system, is a file system for the Linux kernel (operating system), kernel. It was initially designed by French software developer Rémy Card as a replacement for the extended file system (ext). Having been designed ...
and its successors store the data of symlinks (typically file names) in this way if the data is no more than 60 bytes ("fast symbolic links").
Ext4 ext4 (fourth extended filesystem) is a journaling file system for Linux, developed as the successor to ext3. ext4 was initially a series of backward-compatible extensions to ext3, many of them originally developed by Cluster File Systems for ...
has a file system option called inline_data that allows ext4 to perform inlining if enabled during file system creation. Because an inode's size is limited, this only works for very small files.


In non-Unix systems

*
NTFS NT File System (NTFS) (commonly called ''New Technology File System'') is a proprietary journaling file system developed by Microsoft in the 1990s. It was developed to overcome scalability, security and other limitations with File Allocation Tabl ...
has a master file table (MFT) storing files in a B-tree. Each entry has a "fileID", analogous to the inode number, that uniquely refers to this entry. The three timestamps, a device ID, attributes, reference count, and file sizes are found in the entry, but unlike in POSIX the permissions are expressed through a different API. The on-disk layout is more complex. The earlier FAT file systems did not have such a table and were incapable of making hard links. ** NTFS also has a concept of inlining small files into the MFT entry. ** The derived
ReFS Resilient File System (ReFS), codenamed "Protogon", is a Microsoft proprietary file system introduced with Windows Server 2012 with the intent of becoming the "next generation" file system after NTFS. ReFS was designed to overcome problem ...
has a homologous MFT. ReFS has a 128-bit file ID; this extension was also backported to NTFS, which originally had a 64-bit file ID. * The same stat-like API can be used on
Cluster Shared Volumes Cluster Shared Volumes (CSV) is a feature of Failover Clustering first introduced in Windows Server 2008 R2 for use with the Hyper-V role. A Cluster Shared Volume is a shared disk containing an NTFS or ReFS (ReFS: Windows Server 2012 R2 or newer) v ...
, so it presumably has a similar concept of a file ID.


See also

*
inode pointer structure The inode pointer structure is a structure adopted by the inode of a file in the Version 6 Unix file system, Version 7 Unix file system, and Unix File System (UFS) to list the addresses of a file's data blocks. It is also adopted by many relate ...
*
inotify inotify (inode notify) is a Linux kernel subsystem created by John McCutchan, which monitors changes to the filesystem, and reports those changes to applications. It can be used to automatically update directory views, reload configuration files, ...


References


External links


Anatomy of the Linux File System




{{File systems Unix file system technology