
"Everything is a file" is an approach to interface design in
Unix derivatives. While this turn of phrase does not as such figure as a Unix design principle or philosophy, it is a common way to analyse designs, and it informs the design of new interfaces in a way that prefers, in rough order of importance:

# representing objects as file descriptors in favour of alternatives like abstract handles or names,
# operating on the objects with standard input/output operations returning byte streams to be interpreted by applications (rather than explicitly structured data), and
# allowing the usage or creation of objects by opening or creating files in the global filesystem name space.

The lines between the common interpretations of "file" and "file descriptor" are often blurred when analysing Unix, and nameability of ''files'' is the least important part of this principle; thus, it is sometimes described as "Everything is a file descriptor". The approach has been interpreted differently over time, according to the philosophy of each system and the domain to which it is applied. The rest of this article demonstrates notable examples of some of those interpretations, and their repercussions.
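
As a rough illustration of the second preference, the following Python sketch pushes the same bytes through a pipe, a connected socket pair, and a regular file using nothing but write() and read() on file descriptors. The helper names (roundtrip, same_calls_everywhere) and the scratch file name are invented for this example; Python's os module is used only because it exposes the underlying system calls fairly directly.

```python
import os
import socket
import tempfile


def roundtrip(write_fd: int, read_fd: int, payload: bytes) -> bytes:
    """Write payload to one descriptor and read it back from another."""
    os.write(write_fd, payload)
    return os.read(read_fd, len(payload))


def same_calls_everywhere(payload: bytes = b"hello") -> dict:
    results = {}

    # A pipe: a pair of unnamed descriptors.
    r, w = os.pipe()
    results["pipe"] = roundtrip(w, r, payload)
    os.close(r)
    os.close(w)

    # A connected socket pair: again, just two descriptors.
    a, b = socket.socketpair()
    results["socket"] = roundtrip(a.fileno(), b.fileno(), payload)
    a.close()
    b.close()

    # A regular file, opened twice so that each descriptor has its own offset.
    with tempfile.TemporaryDirectory() as scratch_dir:
        path = os.path.join(scratch_dir, "scratch")
        wfd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
        rfd = os.open(path, os.O_RDONLY)
        results["file"] = roundtrip(wfd, rfd, payload)
        os.close(wfd)
        os.close(rfd)

    return results


if __name__ == "__main__":
    print(same_calls_everywhere())
```

The point of the sketch is that roundtrip() neither knows nor cares which kind of object sits behind each descriptor.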


Objects as file descriptors

Under Unix, a directory can be opened like a regular file and read as fixed-size records of ''(i-node, filename)'', but directories cannot be written to directly: they are modified by the kernel as a side-effect of creating and removing files within the directory.

Some interfaces only follow a subset of these guidelines. For example, pipes do ''not'' exist on the filesystem: pipe() creates a pair of unnameable file descriptors. The later addition of named pipes (FIFOs), standardised by POSIX, fills this gap.

This does not mean that the ''only'' operations on an object are reading and writing. ioctl() and similar interfaces allow for object-specific operations (like controlling tty characteristics). Directory file descriptors can be used to alter path look-ups (with a growing number of *at() system call variants like openat()) or to change the working directory to the one represented by the file descriptor (fchdir()), in both cases preventing race conditions and being faster than the alternative of looking up the entire path.

Socket file descriptors require configuration (setting the remote address and connecting) after creation before being used for I/O. A server socket may not be used for I/O directly at all: in connection-based protocols, bind() assigns a local address to a socket, listen() marks it as accepting connections, and accept() waits until a remote process connects, then returns a ''new'' socket file descriptor representing that direct bidirectional connection. This approach allows the objects used by a program to be managed in a standardised manner, just like any other file: after binding to an address, privileges may be dropped; the server socket may be distributed among many processes by fork()ing (or closed in subprocesses that should not have access); and the individual connections' sockets may be given as standard input/output to specialised handlers for those connections, as in the super-server/CGI/inetd paradigms.

Many interfaces present in early Unixes that do not use file descriptors became duplicated in later designs. The alarm()/setitimer() system calls schedule the delivery of a signal after the specified time elapses; this timer is not inherited by fork()ed children, but persists after exec(). The POSIX timer_create() API serves a similar function, but its timers, identified by opaque handles, are destroyed on exec() as well as in child processes. Both interfaces always deliver their completions asynchronously, and cannot be poll()ed/select()ed, making their integration into a complex event loop more difficult.

The timerfd design (originally found in Linux) turns each timer object into a file descriptor, which can be individually observed with poll() and similar calls, and whose inheritance by child processes can be controlled with the standard close()/CLOEXEC/CLOFORK controls. Where the POSIX API has timer_getoverrun() to return how many times the timer elapsed, a timerfd returns this count as the result of read(). This operation blocks, so waiting until a timerfd elapses is as easy as reading from it; there is no way to do this atomically with classic Unix or POSIX timers. The timer can also be inspected without blocking by performing a non-blocking read (a standard I/O operation).
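
The directory-descriptor look-ups described above can be sketched in Python, whose dir_fd parameter to os.open() corresponds to openat(). This is a minimal sketch under assumptions: the function names and the "a", "b" and "config" file names are hypothetical, and the platform is assumed to support O_DIRECTORY and dir_fd (as Linux and the BSDs do).

```python
import os
import tempfile


def read_relative(dir_fd: int, name: str) -> bytes:
    """Open `name` relative to an already-open directory, as openat() does."""
    fd = os.open(name, os.O_RDONLY, dir_fd=dir_fd)
    try:
        return os.read(fd, 4096)
    finally:
        os.close(fd)


def survives_rename() -> bytes:
    with tempfile.TemporaryDirectory() as parent:
        old_path = os.path.join(parent, "a")
        os.mkdir(old_path)
        with open(os.path.join(old_path, "config"), "wb") as f:
            f.write(b"key=value\n")

        dir_fd = os.open(old_path, os.O_RDONLY | os.O_DIRECTORY)
        try:
            # Renaming the directory invalidates the old path,
            # but not the descriptor that already refers to it.
            os.rename(old_path, os.path.join(parent, "b"))
            return read_relative(dir_fd, "config")
        finally:
            os.close(dir_fd)


if __name__ == "__main__":
    print(survives_rename())
```

Because the look-up starts from the descriptor rather than from a path string, a concurrent rename of the directory cannot redirect it, which is the race the prose refers to.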

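The server-socket flow can be sketched as follows: the listening socket is never used for data, and accept() hands back a separate descriptor per connection. This is an illustrative sketch only; the function name is invented, and it assumes the loopback interface is available and uses port 0 to let the kernel pick a free port.

```python
import socket


def accept_one() -> tuple:
    """Return (descriptors_differ, bytes_received) for one loopback connection."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))   # assign a local address; 0 = any free port
    server.listen(1)                # mark the socket as accepting connections

    # The connection completes in the kernel's backlog before accept() runs.
    client = socket.create_connection(server.getsockname())
    conn, _peer = server.accept()   # a new descriptor for this connection
    try:
        client.sendall(b"ping")
        data = conn.recv(4)
        return (server.fileno() != conn.fileno(), data)
    finally:
        conn.close()
        client.close()
        server.close()


if __name__ == "__main__":
    print(accept_one())
```

In a real server the accepted descriptor would typically be passed to a handler (or installed as its standard input/output), while the listening descriptor stays with the accepting process.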

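Waiting on a timerfd with ordinary descriptor operations might look like the sketch below. It assumes Python 3.13 or later on Linux, where os.timerfd_create() and os.timerfd_settime() are available; on any other build the hypothetical helper simply returns None rather than emulating the facility.

```python
import os
import select
import sys
import time


def wait_on_timerfd(delay: float = 0.05):
    """Arm a one-shot timerfd, wait for it with poll(), return the expiry count.

    Returns None when this Python build has no timerfd wrappers.
    """
    if not hasattr(os, "timerfd_create"):
        return None

    fd = os.timerfd_create(time.CLOCK_MONOTONIC, flags=os.TFD_CLOEXEC)
    try:
        os.timerfd_settime(fd, initial=delay)  # no interval: fires once
        poller = select.poll()
        poller.register(fd, select.POLLIN)
        poller.poll()                          # blocks until the timer fires
        # read() yields an 8-byte native-endian count of expirations.
        return int.from_bytes(os.read(fd, 8), sys.byteorder)
    finally:
        os.close(fd)


if __name__ == "__main__":
    print(wait_on_timerfd())
```

The same descriptor could be registered alongside sockets and pipes in one event loop, which is the integration the signal-based timers make awkward.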
Objects in the filesystem namespace


Special file types

Device special files are a defining characteristic of Unix: initially, opening a regular file with i-node number ≤40 (traditionally stored under /dev) instead returned a file descriptor corresponding to a device, handled by the device driver. The magic i-node number scheme was later codified into files with type S_IFBLK/S_IFCHR. Opening special files is subject to the same file-system permission checks as opening regular files, allowing common access control: chown dmr /usr/dmr /dev/rk0; chmod o= /usr/dmr /dev/rk0 changes the ownership and file access mode of both the directory /usr/dmr and the device /dev/rk0.

For block devices (hard disks and tape drives), due to their size, this meant unique semantics: they were block-addressed, and programs needed to be written specifically to work correctly with them. This has been described as "extremely unfortunate", and later interfaces alleviate this. In many cases, magnetic tapes continue to have unique semantics: some tapes can be partitioned into "files", and the driver signals an end-of-file condition after the end of a partition is reached, so cp /dev/nrst0 file1; cp /dev/nrst0 file2 will create file1 and file2 consisting of two consecutive partitions of the tape. The driver provides an abstraction layer that presents a tape file descriptor as if it were a regular file, to fit into the "everything is a file" paradigm. Specialised programs like mt are used to move between partitions on such a tape.

Named pipes (FIFOs) appear as S_IFIFO-type files in the filesystem, can be renamed, and may be opened like regular files. Under Unix derivatives, Unix-domain sockets appear as S_IFSOCK-type files in the filesystem and can be renamed, but cannot be open()ed: one must create the correct type of socket file descriptor and connect() explicitly. Under Plan 9, sockets in the filesystem may be opened like regular files.
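
The difference between FIFOs and Unix-domain sockets can be probed with a short Python sketch. The function and file names are hypothetical, and the exact error raised when open()ing a socket file varies by system (commonly ENXIO on Linux), so the sketch only records whatever errno it sees.

```python
import os
import socket
import stat
import tempfile


def probe_special_files() -> dict:
    out = {}
    with tempfile.TemporaryDirectory() as d:
        # A FIFO is a named filesystem object that open() accepts.
        fifo = os.path.join(d, "fifo")
        os.mkfifo(fifo)
        out["fifo_is_fifo"] = stat.S_ISFIFO(os.stat(fifo).st_mode)
        # O_NONBLOCK lets the read side open without waiting for a writer.
        rfd = os.open(fifo, os.O_RDONLY | os.O_NONBLOCK)
        wfd = os.open(fifo, os.O_WRONLY)
        os.write(wfd, b"via fifo")
        out["fifo_data"] = os.read(rfd, 64)
        os.close(wfd)
        os.close(rfd)

        # A bound Unix-domain socket also has a name, but open() rejects it.
        path = os.path.join(d, "sock")
        server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        server.bind(path)
        out["sock_is_sock"] = stat.S_ISSOCK(os.stat(path).st_mode)
        try:
            os.close(os.open(path, os.O_RDONLY))
            out["sock_open_errno"] = None
        except OSError as exc:
            out["sock_open_errno"] = exc.errno
        server.close()
    return out


if __name__ == "__main__":
    print(probe_special_files())
```

To actually talk to the socket, a client would create its own AF_UNIX socket and connect() it to the same path.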


As a replacement for dedicated system calls

Modern systems contain high-performance I/O event notification facilities: kqueue (BSD derivatives), epoll (Linux), IOCP (Windows NT, Solaris), and /dev/poll (Solaris). The control object is generally created (kqueue(), epoll_create()) and configured (kevent(), epoll_ctl()) with dedicated system calls. A /dev/poll instance, by contrast, is created by opening the file /dev/poll directly; the objects to observe are configured by writing to it, and additional configuration is done with ioctl()s.

Memory may be allocated by requesting an anonymous memory mapping, one that doesn't correspond to any file. On modern systems this can be done by specifying no file and MAP_ANONYMOUS; in UNIX System V Release 4, this was done by opening /dev/zero and mmap()ping it.
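
For a feel of how such a facility is used, here is a small sketch built on Python's selectors module, whose DefaultSelector wraps epoll on Linux and kqueue on BSD-derived systems. It stands in for the dedicated system calls named above rather than calling them directly, and the function name is invented for the example.

```python
import os
import selectors


def wait_readable(timeout: float = 1.0) -> tuple:
    """Return (selector_class_name, idle_before_write, ready_after_write)."""
    sel = selectors.DefaultSelector()
    r, w = os.pipe()
    try:
        sel.register(r, selectors.EVENT_READ)
        idle = sel.select(timeout=0) == []   # nothing has been written yet
        os.write(w, b"x")
        ready = [key.fd for key, _events in sel.select(timeout)] == [r]
        return (type(sel).__name__, idle, ready)
    finally:
        sel.close()
        os.close(r)
        os.close(w)


if __name__ == "__main__":
    print(wait_readable())
```

The registration step is where the designs differ: here it goes through a library call backed by a dedicated system call, whereas /dev/poll takes the same information as data written to an opened file.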

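The two ways of obtaining anonymous memory can be compared with Python's mmap module, where a file descriptor of -1 requests an anonymous mapping. This is a sketch under assumptions: the function names are hypothetical, and it presumes a Unix-like system that provides /dev/zero and allows it to be mapped privately.

```python
import mmap
import os


def anonymous_mapping(length: int = mmap.PAGESIZE) -> mmap.mmap:
    """Modern route: no file at all, just MAP_ANONYMOUS."""
    return mmap.mmap(-1, length, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)


def dev_zero_mapping(length: int = mmap.PAGESIZE) -> mmap.mmap:
    """Older route: open /dev/zero and map it privately."""
    fd = os.open("/dev/zero", os.O_RDWR)
    try:
        return mmap.mmap(fd, length, flags=mmap.MAP_PRIVATE)
    finally:
        os.close(fd)  # the mapping remains valid after the descriptor is closed


def both_zero_filled_and_writable() -> list:
    results = []
    for make in (anonymous_mapping, dev_zero_mapping):
        m = make()
        zeroed = m[:16] == bytes(16)
        m[:5] = b"hello"
        results.append(zeroed and m[:5] == b"hello")
        m.close()
    return results


if __name__ == "__main__":
    print(both_zero_filled_and_writable())
```

Both routes yield zero-filled, writable private memory; they differ only in whether a filesystem name is involved in getting it.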

API filesystems

Operating system APIs can be implemented as regular system calls, or as synthetic file-systems. In the former case, system state can only be inspected by specially-written programs shipped with the system, and any additional processing desired by the user needs to filter and parse the output of those programs, execute them to write the desired state, or be implemented in the native system programming language.

In the latter case, system state is presented as if it were regular files and directories. On systems with a procfs, information about running processes can be obtained by looking at, canonically, /proc, which contains directories named after the PIDs running on the system. Each contains files like stat (status) with process metadata; cwd, exe, and root symbolic links to the process' working directory, executable image, and root directory; and directories like fd, which contains symbolic links to the files the process has opened, named after the file descriptors. Because these attributes are presented as files and symbolic links, standard utilities work on them: one can, say, inspect the identity of the process with grep Uid /proc/1392400/status, go to the same directory as a process is in with cd /proc/1392400/cwd, see what files a process has open with ls -l /proc/1392400/fd, then open a file that process has open with less /proc/1392400/fd/8. This improves ergonomics over parsing this data from the output of a utility.

Under Linux, symbolic links under procfs are "magic": they can actually behave like cross-filesystem hard links to the files they point to. This behaviour allows recovery of files removed from the filesystem but still open by a process, and permanently persisting files created by O_TMPFILE in the filesystem (which otherwise cannot be named).

4.4BSD-derived sysctls are key/value mappings managed by the sysctl program, which lists all variables with sysctl -a, shows the value of one variable with sysctl net.inet.ip.forwarding, and sets it with sysctl -w net.inet.ip.forwarding=1. Under Linux, the equivalent mechanism is provided by procfs under the /proc/sys tree: the respective operations can be done with find /proc/sys or grep -r ^ /proc/sys, cat /proc/sys/net/ipv4/ip_forward, and echo 1 > /proc/sys/net/ipv4/ip_forward.
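
The per-process look-ups described above (grep Uid, the cwd link, the fd directory) need no special API from a program either. The sketch below assumes a Linux-style procfs mounted at /proc and returns None where there is none; the function name and the shape of the returned dictionary are invented for the example.

```python
import os


def inspect_process(pid="self"):
    """Collect a few facts about a process using plain file operations."""
    base = f"/proc/{pid}"
    if not os.path.exists(f"{base}/status"):
        return None  # no Linux-style procfs available

    with open(f"{base}/status") as f:
        uid_line = next(line for line in f if line.startswith("Uid:"))

    return {
        # First number on the "Uid:" line is the real user ID.
        "real_uid": int(uid_line.split()[1]),
        # cwd is a symbolic link to the working directory.
        "cwd": os.readlink(f"{base}/cwd"),
        # Entries in fd are named after the open descriptor numbers.
        "fds": sorted(int(name) for name in os.listdir(f"{base}/fd")),
    }


if __name__ == "__main__":
    print(inspect_process())
```

Everything here is open(), readlink() and a directory listing; no process-inspection system call is involved.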

For convenience or standards conformance, dedicated inspection tools (like ps and sysctl) may still be provided, using these filesystems as data sources/sinks. sysfs and debugfs are similar Linux interfaces for further configuring the kernel: writing mem to /sys/power/state will trigger a suspend-to-RAM procedure, and writing 2 to /sys/module/iwlwifi/parameters/led_mode will start blinking the Wi-Fi LED on activity.

These are ''synthetic'' file-systems because the contents of each file are not stored anywhere verbatim: when the file is read, the appropriate kernel data structures are serialised into the reading process' input buffer, and when the file is written to, the output buffer is parsed. This means that the file abstraction is broken, since the file metadata isn't valid: depending on the filesystem, each file reports a size of 0 or PAGE_SIZE, even though reading the data will yield a different number of bytes.
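
The size mismatch is easy to observe. The sketch below compares the size reported by stat() with the number of bytes actually read, assuming a Linux-style procfs; the default path and the function name are choices made for this example, and the function returns None where the file does not exist.

```python
import os


def reported_vs_actual(path: str = "/proc/self/status"):
    """Return (st_size from stat, bytes obtained by reading), or None."""
    if not os.path.exists(path):
        return None  # no procfs on this system
    reported = os.stat(path).st_size
    with open(path, "rb") as f:
        actual = len(f.read())
    return reported, actual


if __name__ == "__main__":
    print(reported_vs_actual())
```

Programs that size their buffers from st_size, or that treat a zero-length file as empty, therefore need special handling for such files.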


See also

* Unix architecture
* Object-oriented analysis and design

