Large File Support

Large-file support (LFS) is the term frequently applied to the ability to create files larger than either 2 or 4 GiB on 32-bit filesystems.


Details

Traditionally, many operating systems and their underlying file system implementations used 32-bit integers to represent file sizes and positions. Consequently, no file could be larger than 2^32 − 1 bytes (4 GiB − 1). In many implementations, the problem was exacerbated by treating the sizes as signed numbers, which further lowered the limit to 2^31 − 1 bytes (2 GiB − 1). Files that were too large for 32-bit operating systems to handle came to be known as ''large files''.

While the limit was quite acceptable at a time when hard disks were smaller, the general increase in storage capacity, combined with increased server and desktop file usage, especially for database and multimedia files, led to intense pressure on OS vendors to overcome the limitation. In 1996, multiple vendors responded by forming an industry initiative known as the Large File Summit (an obvious backronym of "LFS") to support large files on POSIX; at the time, Windows NT already supported large files on NTFS. The summit was tasked with defining a standardized way to switch to 64-bit numbers to represent file sizes.

This switch caused deployment issues and required design modifications, the consequences of which can still be seen:
* The change to 64-bit file sizes frequently required incompatible changes to file system layout, which meant that large-file support sometimes necessitated a file system change. For example, the FAT32 file system does not support files larger than 4 GiB − 1 (with older applications, only 2 GiB − 1); the variant FAT32+ does support larger files (up to 256 GiB − 1), but so far is supported only in some versions of DR-DOS, so users of Microsoft Windows have to use NTFS or exFAT instead.
* To support binary compatibility with old applications, operating system interfaces had to retain their use of 32-bit file sizes, and new interfaces had to be designed specifically for large-file support.
* To support writing portable code that makes use of LFS where possible, C standard library authors devised mechanisms that, depending on preprocessor constants, transparently redefined the functions to the 64-bit large-file aware ones.
* Many old interfaces, especially C-based ones, explicitly specified argument types in a way that did not allow a straightforward or transparent transition to 64-bit types. For example, the C functions fseek and ftell operate on file positions of type long int, which is typically 32 bits wide on 32-bit platforms and cannot be made larger without sacrificing backward compatibility. (This was resolved by introducing the new functions fseeko and ftello in POSIX. On Windows machines, under Visual C++, the functions _fseeki64 and _ftelli64 are used.)
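
The preprocessor mechanism and the fseeko/ftello pair described above can be sketched roughly as follows. This is a minimal illustration, assuming a glibc-style C library in which defining _FILE_OFFSET_BITS as 64 before any system header widens off_t to 64 bits; that macro and the helper names seek_and_tell and demo_large_offset are not taken from the text above and are used here only for illustration.

```c
/* Assumption: a glibc-style C library. Both macros must be defined before
   the first system header is included. */
#define _POSIX_C_SOURCE 200809L   /* makes fseeko/ftello visible */
#define _FILE_OFFSET_BITS 64      /* asks for a 64-bit off_t on 32-bit targets */

#include <assert.h>
#include <stdio.h>
#include <sys/types.h>

/* Fail at compile time if large-file support was not activated. */
static_assert(sizeof(off_t) == 8, "off_t is expected to be 64 bits wide");

/* Hypothetical helper: seek to an absolute position and report where the
   stream ended up. Returns -1 on failure. */
off_t seek_and_tell(FILE *stream, off_t position)
{
    if (fseeko(stream, position, SEEK_SET) != 0)
        return (off_t)-1;
    return ftello(stream);
}

/* Hypothetical demo: position a temporary file 3 GiB from its start, which
   is beyond what a signed 32-bit long can express. Nothing is written, so
   the file itself stays empty. */
off_t demo_large_offset(void)
{
    const off_t three_gib = (off_t)3 << 30;
    FILE *tmp = tmpfile();
    if (tmp == NULL)
        return (off_t)-1;
    off_t reached = seek_and_tell(tmp, three_gib);
    fclose(tmp);
    return reached;
}
```

The same source built without the _FILE_OFFSET_BITS definition on a 32-bit target would be expected to trip the compile-time check, which is one way to notice that a build has silently fallen back to 32-bit file offsets.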


Adoption

The usage of the large-file API in 32-bit programs had been incomplete for a long time. A 2002 analysis showed that many base libraries of operating systems were still shipped without large-file support, which limited the applications using them. The much-used zlib library did not support 64-bit large files on 32-bit platforms until 2006.

The problem disappeared slowly as PCs and workstations moved completely to 64-bit computing. Microsoft Windows Server 2008 was the last server version to ship in 32-bit. Red Hat Enterprise Linux 7 was published in 2014 only as a 64-bit operating system. Ubuntu Linux stopped delivering a 32-bit variant in 2019. Nvidia stopped developing 32-bit drivers in 2018 and stopped delivering updates after January 2019. Apple stopped developing 32-bit Mac OS versions in 2018, delivering macOS Mojave only as a 64-bit operating system. The end of life for Windows 10 on the desktop has been set for 2025; this date is related to the last upgrades from older systems such as Windows 7 and Windows 8 in January 2020, as some of those systems ran on old computers built on the i386 architecture. Windows 11, however, has shipped only as a 64-bit operating system since its first version in 2021.

A similar development can be seen in the mobile area. Google required applications in its app store to provide 64-bit versions by August 2019, which allows 32-bit support for Android to be discontinued later. The shift towards 64-bit started in 2014, when all new processors were designed for a 64-bit architecture and Android 5 ("Lollipop") was published, providing a fitting 64-bit variant of the operating system. Apple had made the shift the year before, starting production of the 64-bit Apple A7 in 2013. Google started to deliver the development environment for Linux only in 64-bit by 2015. In May 2019 the share of Android versions below 5 had fallen to ten percent. As app developers concentrate on a single compilation variant, many manufacturers, for example Niantic, started to require Android 5 as the minimum version by mid-2019. Subsequently the 32-bit versions were hard to get.

Except for embedded systems with their specialized programs, program code no longer needs to account for varying large-file support after 2020.


Related problems

The year 2038 problem is a well-known further case in which a 32-bit "long" on 32-bit platforms leads to problems. Like the large-file limitation, it becomes obsolete as systems move to 64-bit only. In the meantime, a 64-bit timestamp was introduced. In the Win32 API it is visible in functions that carry a "64" suffix alongside the earlier "32" suffix. When large-file support was added to the Win32 API, it led to functions with an additional "i64" suffix, which sometimes makes for four combinations (findfirst32, findfirst64, findfirst32i64, findfirst64i32). By comparison, the UNIX98 API introduces functions with a "64" suffix when "_LARGEFILE64_SOURCE" is used.

Related to the large-file API, there is a limitation on block numbers for mass storage media. With a common size of 512 bytes per data block, the barrier resulting from 32-bit numbers was reached later. When hard disk drives reached a size of 2 terabytes (around 2010), the master boot record had to be replaced by the GUID Partition Table, which uses 64-bit numbers for the LBA (logical block address).

On Unix-like operating systems this also required enlarging the inode numbers that are used in some functions (stat64, setrlimit64). The Linux kernel introduced this in 2001, leading to version 2.4, and glibc picked it up in the same year. Because large-file support and large-disk support were introduced at the same time, the GNU C Library exports 64-bit inode structures on 32-bit architectures whenever the Unix LFS API is activated in program code.

When the kernel moved to 64-bit inodes, the ext3 file system used them internally in the driver by 2001. However, the inode format on the storage media itself was stuck at 32-bit numbers. As mass storage devices moved to the Advanced Format of 4 kilobytes per block, the actual limit of that file system format is 8 or 16 terabytes. Handling larger disk partitions requires a different file system such as XFS, which was designed with 64-bit inodes from the start, allowing for exabyte files and partitions. The first 16-terabyte magnetic disk drives were delivered by mid-2019. Solid-state drives with 32 TiB for data centers were available as early as 2016, with some manufacturers forecasting 100 TiB SSDs by 2020.
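
The capacity limits quoted in this section follow from multiplying the number of addressable blocks by the block size. The following C fragment is a rough sketch of that arithmetic, not part of any real API: the function name max_addressable_bytes is hypothetical, and reading the 8 and 16 terabyte figures as signed (31-bit) and unsigned (32-bit) block numbers with 4 KiB blocks is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of bytes addressable with block numbers of
   the given bit width. The caller must keep the result below 2^64. */
uint64_t max_addressable_bytes(unsigned block_number_bits, uint64_t block_size)
{
    assert(block_number_bits < 64);
    /* 2^bits distinct block numbers, each naming block_size bytes. */
    return ((uint64_t)1 << block_number_bits) * block_size;
}
```

Under these assumptions, 32-bit block numbers with 512-byte blocks give 2^41 bytes (2 TiB), matching the master boot record limit, while 4096-byte blocks give 2^43 bytes (8 TiB) for 31-bit and 2^44 bytes (16 TiB) for 32-bit block numbers.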


See also

* 2 GB limit
* RF64 – 64-bit support for BWF WAV audio files
* Comparison of large-file support in text editors
* FAT32+
* File size
* Long filename support (LFN)
* Year 2038 problem


External links

* Jaeger, Andreas (2005-02-15). "Large File Support in Linux". SuSE GmbH. http://www.suse.de/~aj/linux_lfs.html (accessed 2006-09-10).