E2compr

The ext2 or second extended file system is a file system for the Linux kernel. It was initially designed by French software developer Rémy Card as a replacement for the extended file system (ext). Having been designed according to the same principles as the Berkeley Fast File System from BSD, it was the first commercial-grade filesystem for Linux.

The canonical implementation of ext2 is the "ext2fs" filesystem driver in the Linux kernel. Other implementations (of varying quality and completeness) exist in GNU Hurd, MINIX 3, some BSD kernels, MiNT, and Haiku, and as third-party Microsoft Windows and macOS drivers.

ext2 was the default filesystem in several Linux distributions, including Debian and Red Hat Linux, until supplanted by ext3, a journaling file system that is almost completely compatible with ext2. ext2 is still the filesystem of choice for flash-based storage media (such as SD cards and USB flash drives) because its lack of a journal increases performance and minimizes the number of writes, and flash devices can endure only a limited number of write cycles. Since 2009, the Linux kernel has supported a journal-less mode of ext4, which provides benefits not found in ext2, such as larger file and volume sizes.


History

The early development of the Linux kernel was done as cross-development under the MINIX operating system, and the MINIX file system was used as Linux's first file system. The MINIX file system was mostly free of bugs, but it used 16-bit offsets internally and thus had a maximum size limit of only 64 megabytes. Filenames were also limited to 14 characters. Because of these limitations, work began on a replacement native file system for Linux.

To ease the addition of new file systems and provide a generic file API, VFS, a virtual file system layer, was added to the Linux kernel. The extended file system (ext) was released in April 1992 as the first file system using the VFS API and was included in Linux version 0.96c. The ext file system solved the two major problems of the MINIX file system (the maximum partition size and the 14-character filename limit), allowing 2 gigabytes of data and filenames of up to 255 characters. But it still had problems: there was no support for separate timestamps for file access, inode modification, and data modification.

As a solution for these problems, two new filesystems were developed in January 1993 for Linux kernel 0.99: xiafs and the second extended file system (ext2), which was an overhaul of the extended file system incorporating many ideas from the Berkeley Fast File System. ext2 was also designed with extensibility in mind, with space left in many of its on-disk data structures for use by future versions.

Since then, ext2 has been a testbed for many of the new extensions to the VFS API. Features such as the withdrawn POSIX draft ACL proposal and the withdrawn extended attribute proposal were generally implemented first on ext2 because it was relatively simple to extend and its internals were well understood.

On Linux kernels prior to 2.6.17, restrictions in the block driver mean that ext2 filesystems have a maximum file size of 2 TiB.

ext2 is still recommended over journaling file systems on bootable USB flash drives and other solid-state drives. ext2 performs fewer writes than ext3 because there is no journaling. As the major aging factor of a flash chip is the number of erase cycles, and as erase cycles happen frequently on writes, decreasing writes increases the life span of the solid-state device. Another good practice for filesystems on flash devices is the use of the noatime mount option, for the same reason.


ext2 data structures

The space in ext2 is split up into blocks. These blocks are grouped into block groups, analogous to cylinder groups in the Unix File System. There are typically thousands of blocks on a large file system. Data for any given file is typically contained within a single block group where possible. This is done to minimize the number of disk seeks when reading large amounts of contiguous data.

Each block group contains a copy of the superblock and block group descriptor table, and all block groups contain a block bitmap, an inode bitmap, an inode table, and finally the actual data blocks.

The superblock contains important information that is crucial to the booting of the operating system. Thus backup copies are made in multiple block groups in the file system. However, typically only the first copy of it, which is found at the first block of the file system, is used in the booting.

The group descriptor stores the location of the block bitmap, inode bitmap, and the start of the inode table for every block group. These, in turn, are stored in a group descriptor table.
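As an illustration of how the superblock describes the block-group layout, the following is a minimal Python sketch that decodes a few leading superblock fields from raw bytes. The field order and the magic value 0xEF53 follow the ext2 on-disk format, in which the primary superblock begins 1024 bytes into the volume. The function names `parse_superblock` and `read_superblock` and the returned dictionary keys are illustrative choices, not part of any real tool.

```python
import struct

SUPERBLOCK_OFFSET = 1024  # the primary superblock starts 1024 bytes into the volume
EXT2_MAGIC = 0xEF53       # value of the s_magic field


def parse_superblock(raw: bytes) -> dict:
    """Decode a few leading fields of an ext2 superblock (little-endian)."""
    # The first eleven 32-bit fields, in on-disk order.
    (inodes_count, blocks_count, _reserved_blocks, _free_blocks, _free_inodes,
     first_data_block, log_block_size, _log_frag_size,
     blocks_per_group, _frags_per_group, inodes_per_group) = struct.unpack_from("<11I", raw, 0)
    (magic,) = struct.unpack_from("<H", raw, 56)
    if magic != EXT2_MAGIC:
        raise ValueError("not an ext2 superblock")
    data_blocks = blocks_count - first_data_block
    group_count = -(-data_blocks // blocks_per_group)  # ceiling division
    return {
        "block_size": 1024 << log_block_size,
        "blocks_count": blocks_count,
        "inodes_count": inodes_count,
        "blocks_per_group": blocks_per_group,
        "inodes_per_group": inodes_per_group,
        "group_count": group_count,
    }


def read_superblock(image_path: str) -> dict:
    """Read the primary superblock from a file system image or device node."""
    with open(image_path, "rb") as image:
        image.seek(SUPERBLOCK_OFFSET)
        return parse_superblock(image.read(1024))
```

Calling `read_superblock` on an image file (or, with sufficient privileges, a block device) would report the block size and the number of block groups. Each group then holds its own bitmaps, inode table, and data blocks as described above.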


Inodes

Every file or directory is represented by an inode. The term "inode" comes from "index node" (over time, it became i-node and then inode). The inode includes data about the size, permission, ownership, and location on disk of the file or directory.

Quote from the Linux kernel documentation for ext2:
"There are pointers to the first 12 blocks which contain the file's data in the inode. There is a pointer to an indirect block (which contains pointers to the next set of blocks), a pointer to a doubly indirect block and a pointer to a trebly indirect block."

Thus, the ext2 inode holds an array of 15 block pointers. Pointers 1 to 12 point to direct blocks, pointer 13 points to an indirect block, pointer 14 points to a doubly indirect block, and pointer 15 points to a triply indirect block.
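The pointer scheme above can be sketched as a small Python function that reports which kind of pointer reaches a given logical block of a file. It assumes 4-byte block numbers, so an indirect block of size b holds b/4 pointers. The function name `pointer_level` is hypothetical and chosen only for this illustration.

```python
def pointer_level(block_index: int, block_size: int) -> str:
    """Return which inode pointer reaches the given logical block of a file.

    block_index counts the file's blocks from 0. Block numbers are assumed
    to be 4 bytes wide, so one indirect block holds block_size // 4 pointers.
    """
    if block_index < 0:
        raise ValueError("block index must be non-negative")
    pointers_per_block = block_size // 4

    if block_index < 12:
        return "direct"
    block_index -= 12

    if block_index < pointers_per_block:
        return "indirect"
    block_index -= pointers_per_block

    if block_index < pointers_per_block ** 2:
        return "doubly indirect"
    block_index -= pointers_per_block ** 2

    if block_index < pointers_per_block ** 3:
        return "triply indirect"
    raise ValueError("block index is beyond the reach of the 15 pointers")
```

With 1 KiB blocks, for example, logical blocks 0 to 11 are reached directly and the next 256 blocks through the single indirect block.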


Directories

Each directory is a list of directory entries. Each directory entry associates one file name with one inode number, and consists of the inode number, the length of the file name, and the actual text of the file name. To find a file, the directory is searched front-to-back for the associated filename. For reasonable directory sizes, this is fine. But for very large directories this is inefficient, and ext3 offers a second way of storing directories (HTree) that is more efficient than just a list of filenames.

The root directory is always stored in inode number two, so that the file system code can find it at mount time. Subdirectories are implemented by storing the name of the subdirectory in the name field, and the inode number of the subdirectory in the inode field. Hard links are implemented by storing the same inode number with more than one file name. Accessing the file by either name results in the same inode number, and therefore the same data.

The special directories "." (current directory) and ".." (parent directory) are implemented by storing the names "." and ".." in the directory, and the inode number of the current and parent directories in the inode field. The only special treatment these two entries receive is that they are automatically created when any new directory is made, and they cannot be deleted.
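The front-to-back search described above can be sketched in Python as follows. The entry layout assumed here is the revision 1 on-disk form: a 4-byte inode number, a 2-byte record length, a 1-byte name length, a 1-byte file type, then the name. The record length and file type fields are not mentioned in the text above. The helper names `iter_dir_entries`, `lookup`, and `make_entry` are hypothetical.

```python
import struct

ENTRY_HEADER = struct.Struct("<IHBB")  # inode, rec_len, name_len, file_type


def iter_dir_entries(block: bytes):
    """Yield (name, inode) pairs from one directory block, front to back."""
    offset = 0
    while offset + ENTRY_HEADER.size <= len(block):
        inode, rec_len, name_len, _file_type = ENTRY_HEADER.unpack_from(block, offset)
        if rec_len < ENTRY_HEADER.size:
            break  # malformed record; stop rather than loop forever
        if inode != 0:  # inode 0 marks an unused entry
            start = offset + ENTRY_HEADER.size
            yield block[start:start + name_len], inode
        offset += rec_len


def lookup(block: bytes, name: bytes):
    """Linear search: return the inode number for name, or None if absent."""
    for entry_name, inode in iter_dir_entries(block):
        if entry_name == name:
            return inode
    return None


def make_entry(inode: int, name: bytes, rec_len: int) -> bytes:
    """Build one directory entry padded to rec_len bytes (for demonstration)."""
    header = ENTRY_HEADER.pack(inode, rec_len, len(name), 0)
    return (header + name).ljust(rec_len, b"\0")
```

Because every lookup walks the entries in order, the cost grows linearly with the number of entries, which is why large directories are slow without an index such as HTree.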


Allocating data

When a new file or directory is created, ext2 must decide where to store the data. If the disk is mostly empty, then data can be stored almost anywhere. However, clustering the data with related data will minimize seek times and maximize performance.

ext2 attempts to allocate each new directory in the group containing its parent directory, on the theory that accesses to parent and child directories are likely to be closely related. ext2 also attempts to place files in the same group as their directory entries, because directory accesses often lead to file accesses. However, if the group is full, then the new file or new directory is placed in some other non-full group.

The data blocks needed to store directories and files can be found by looking in the data allocation bitmap. Any needed space in the inode table can be found by looking in the inode allocation bitmap.
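The placement preference described above can be illustrated with a deliberately simplified Python sketch: try the preferred group first, then fall back to another group that still has free space. This is not the kernel's allocator, which applies further heuristics. The function name `choose_group` and the per-group free counts are assumptions made for the example.

```python
from typing import List, Optional


def choose_group(free_counts: List[int], preferred: int) -> Optional[int]:
    """Pick a block group for a new file or directory.

    free_counts[i] is the number of free slots in group i (a stand-in for
    what the bitmaps would report). preferred is the group of the parent
    directory. Returns None when every group is full.
    """
    if free_counts[preferred] > 0:
        return preferred
    group_total = len(free_counts)
    # Fall back to the next non-full group, scanning forward with wrap-around.
    for step in range(1, group_total):
        candidate = (preferred + step) % group_total
        if free_counts[candidate] > 0:
            return candidate
    return None
```

Scanning forward from the preferred group is only one possible fallback order. The text above states only that some other non-full group is chosen.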


File-system limits

Some limits of ext2 stem from the on-disk format of the data, others from the operating system's kernel. Most of these factors are determined once, when the file system is built. They depend on the block size and the ratio of the number of blocks to the number of inodes. In Linux the block size is limited by the architecture page size.

There are also some userspace programs that cannot handle files larger than 2 GiB.

If $b$ is the block size, the maximal file size is limited to $\min\left(\left((b/4)^3 + (b/4)^2 + b/4 + 12\right) \times b,\ (2^{32} - 1) \times 512\right)$ due to the i_block structure (an array of EXT2_N_BLOCKS direct/indirect pointers) and i_blocks (a 32-bit integer value) representing the number of 512-byte "blocks" in the file.

The maximal number of sublevel-directories is 31998, due to the link-count limit. Directory indexing is not available in ext2, so there are performance issues for directories with a large number of files (>10,000). The theoretical limit on the number of files in a directory is $1.3 \times 10^{20}$, although this is not relevant for practical situations.

Note: In Linux 2.4 and earlier, block devices were limited to 2 TiB, limiting the maximal size of a partition, regardless of block size.
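To make the file-size bound concrete, here is a short Python sketch that evaluates the formula above for a given block size. The function name `max_file_size` is chosen for illustration. The two terms correspond to the reach of the 15 block pointers and to the 32-bit count of 512-byte sectors.

```python
def max_file_size(block_size: int) -> int:
    """Evaluate the ext2 maximal file size formula, in bytes."""
    pointers = block_size // 4  # 4-byte block numbers per indirect block
    # Blocks reachable through triple, double, and single indirection plus
    # the 12 direct pointers.
    by_pointers = (pointers ** 3 + pointers ** 2 + pointers + 12) * block_size
    # i_blocks is a 32-bit count of 512-byte units.
    by_sector_count = (2 ** 32 - 1) * 512
    return min(by_pointers, by_sector_count)


if __name__ == "__main__":
    for size in (1024, 2048, 4096):
        print(size, max_file_size(size))
```

For 1 KiB and 2 KiB blocks the pointer structure is the binding limit (about 16 GiB and 256 GiB respectively). With 4 KiB blocks the pointer reach exceeds 4 TiB, so the 32-bit sector count caps the file size at just under 2 TiB.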


Compression extension

e2compr is a modification to the ext2 driver in the Linux kernel to support compression and decompression of files by the file system, without any support by user applications. e2compr is a small patch against ext2.

e2compr compresses only regular files; the administrative data (superblock, inodes, directory files, etc.) are not compressed (mainly for safety reasons). Access to compressed blocks is provided for read and write operations. The compression algorithm and cluster size are specified on a per-file basis. Directories can also be marked for compression, in which case every newly created file in the directory will be automatically compressed with the same cluster size and the same algorithm that was specified for the directory.

e2compr is not a new file system. It is only a patch to ext2 made to support the EXT2_COMPR_FL flag. It does not require the user to make a new partition, and will continue to read or write existing ext2 file systems. One can consider it as simply a way for the read and write routines to access files that could have been created by a simple utility similar to gzip or compress. Compressed and uncompressed files coexist nicely on ext2 partitions.

The latest e2compr branch is available for current releases of Linux 2.4, 2.6, and 3.0. The latest patch for Linux 3.0 was released in August 2011 and provides multicore and high memory support. There are also branches for Linux 2.0 and 2.2.


Under other operating systems

Access to ext2 partitions under Microsoft Windows is possible through an Installable File System, such as ext2ifs or Ext2Fsd. Filesystem in Userspace can be used on macOS.


See also

* e2fsprogs
* StegFS – a steganographic file system based on ext2
* cloop
* List of file systems
* Comparison of file systems
* Orlov block allocator – the default block allocator for ext2 in the Linux kernel


References


Notes

* Sourceforge e2compr project
* Sourceforge e2compr documentation
* Sourceforge e3compr project page, ext3 compression, alpha
* Dr. Dobb's Data Compression Newsletter Issue #46 - September 2003


External links


* Ext2Fsd – GPL ext2/ext3 file system driver for Windows 2000/XP/2003/Vista/2008 (open source, supports read and write, works with FreeOTFE)