CacheFS is the name used for several similar software technologies designed to speed up distributed file system file access for networked computers. These technologies operate by storing cached copies of files on secondary memory, typically a local hard disk, so that if a file is accessed again, it can be read locally at much higher speeds than networks typically allow.
CacheFS software is used on several Unix-like operating systems. The original Unix version was developed by Sun Microsystems in 1993. Another version was written for Linux and released in 2003.
Network filesystems are dependent on a network link and a remote server; obtaining a file from such a filesystem can be significantly slower than getting the file locally. For this reason, it can be desirable to cache data from these filesystems on a local disk, thus potentially speeding up future accesses to that data by avoiding the need to go to the network and fetch it again. The software has to check that the remote file has not changed since it was cached, but this is much faster than reading the whole file again.
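The check-then-reuse behaviour described above can be illustrated with a minimal sketch. Everything here is hypothetical: `FakeServer` is an in-memory stand-in for a remote file server, and an integer version number stands in for whatever change indicator a real network filesystem exposes (for example a modification time). It is not the code of any actual CacheFS implementation.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class RemoteFile:
    data: bytes
    version: int  # assumed stand-in for a modification time or change attribute


class FakeServer:
    """Hypothetical in-memory stand-in for a remote file server."""

    def __init__(self) -> None:
        self.files: Dict[str, RemoteFile] = {}
        self.full_reads = 0       # expensive transfers of whole files
        self.version_checks = 0   # cheap "has it changed?" queries

    def write(self, path: str, data: bytes) -> None:
        old = self.files.get(path)
        version = old.version + 1 if old else 1
        self.files[path] = RemoteFile(data, version)

    def get_version(self, path: str) -> int:
        self.version_checks += 1
        return self.files[path].version

    def read(self, path: str) -> Tuple[bytes, int]:
        self.full_reads += 1
        f = self.files[path]
        return f.data, f.version


class ValidatingCache:
    """Keeps local copies and revalidates them before each reuse."""

    def __init__(self, server: FakeServer) -> None:
        self.server = server
        self.local: Dict[str, Tuple[bytes, int]] = {}

    def read(self, path: str) -> bytes:
        cached = self.local.get(path)
        if cached is not None:
            data, version = cached
            # Cheap check: only the version crosses the "network".
            if self.server.get_version(path) == version:
                return data
        # Miss or stale copy: fetch the whole file and remember it.
        data, version = self.server.read(path)
        self.local[path] = (data, version)
        return data
```

Reading the same path twice through `ValidatingCache.read` costs one full transfer and one version check; a third read after the server copy changes triggers a second full transfer.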
Prior art
The Sprite operating system used large disk block caches. These were located in main memory to achieve high performance in its file system. The term CacheFS has found little or no use to describe caches in main memory.
Grossmont version
The first CacheFS implementation, written in 6502 assembler, was a write-through cache developed by Mathew R Mathews at Grossmont College. It was used from Fall 1986 to Spring 1990 on three diskless Apple IIe computers, each with 64 kB of main memory, to cache files from a Nestar file server onto Big Board, a 1 MB DRAM secondary memory device partitioned into CacheFS and TmpFS. The computers ran Pineapple DOS, an Apple DOS 3.3 derivative developed in the course of a follow-on to WR Bornhorst's NSF-funded Instructional Computing System. Pineapple DOS features, including caching, were unnamed; the name CacheFS was introduced seven years later by Sun Microsystems.
Sun version
The first Unix CacheFS implementation was developed by Sun Microsystems and released in Solaris 2.3 in 1993, as part of an expanded feature set for the Network File System (NFS) suite known as Open Network Computing Plus (ONC+) ("New Features in Solaris 2.4", in the Solaris 2.4 AnswerBook documentation, Sun Microsystems, 1994; accessed Sept 10, 2007). It was subsequently used in other UNIX operating systems such as IRIX, starting with the 5.3 release in 1994 ("IRIX 6.5 ONC3/NFS Administrators Guide", Silicon Graphics, 2005, accessed Sept 10, 2007; Ryan Thoryk, revision of January 18, 2007, archived 2007-10-19 at https://web.archive.org/web/20071019203914/http://ryan.tliquest.net/sgi/irix_versions.html#I5, accessed Sept 10, 2007).
Linux version
Linux operating systems now commonly use a new version of CacheFS developed by David Howells. Howells appears to have rewritten CacheFS from scratch, not using Sun's original code.
The Linux CacheFS is currently designed to operate on Andrew File System (AFS) and Network File System (NFS) filesystems.
Terminology
Because the names CacheFS and FS-Cache are so similar, the terminology can be confusing to outsiders. CacheFS is a backend for FS-Cache and handles the actual data storage and retrieval. FS-Cache passes requests from the network filesystem (netfs) to CacheFS.
FS-Cache
The cache facility, or layer, that sits between cache backends such as CacheFS and network filesystems such as NFS or AFS.
Cache backends
CacheFS
CacheFS is a filesystem for the FS-Cache facility. A block device can be used as a cache simply by mounting it. It needs no special activation and is deactivated by unmounting it.
Cachefiles (daemon)
A daemon that uses an existing filesystem (ext3 with user_xattr) as the cache. The cache is bound with "cachefilesd -s".
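The division of labour described in this section (network filesystem, FS-Cache in the middle, interchangeable backends underneath) can be sketched roughly as follows. All class and method names are hypothetical and chosen for illustration; they do not correspond to the Linux kernel interfaces. The sketch also deliberately leaves out coherency checks and culling.

```python
from typing import Callable, Dict, Optional, Protocol


class CacheBackend(Protocol):
    """Interface a storage backend is assumed to offer to the facility."""

    def lookup(self, key: str) -> Optional[bytes]: ...
    def store(self, key: str, data: bytes) -> None: ...


class DictBackend:
    """Hypothetical backend. A real one would keep data on a block
    device (CacheFS) or in files on an existing filesystem (CacheFiles)."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def lookup(self, key: str) -> Optional[bytes]:
        return self._objects.get(key)

    def store(self, key: str, data: bytes) -> None:
        self._objects[key] = data


class CacheFacility:
    """Plays the role of FS-Cache: mediates between netfs and backend."""

    def __init__(self) -> None:
        self.backend: Optional[CacheBackend] = None

    def bind(self, backend: CacheBackend) -> None:
        self.backend = backend

    def unbind(self) -> None:
        self.backend = None

    def read(self, key: str, fetch_remote: Callable[[str], bytes]) -> bytes:
        # With no backend bound, the netfs simply goes to the network.
        if self.backend is None:
            return fetch_remote(key)
        data = self.backend.lookup(key)
        if data is None:
            data = fetch_remote(key)
            self.backend.store(key, data)
        return data
```

Because the network filesystem only ever calls `CacheFacility.read`, a backend can be bound or unbound without the caller changing its behaviour, which mirrors how a cache in this design is activated by mounting or binding and deactivated by unmounting.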
Project status
As of September 2010, the project appeared to be stalled, with some people attempting to revive the code and bring it up to date (Gilliam, Paul, "linux-cachefs mailing list", September 29, 2010).
Features
The facility can be conceptualised by the following diagram: [diagram not reproduced]
The facility (known as FS-Cache) is designed to be as transparent as possible to a user of the system. Applications should just be able to use NFS files as normal, without any knowledge of there being a cache.
See also
* Page cache
External links
* Fscache-ols2006 Presentation
* D.Howells@Red Hat
* Steve D.@Red Hat
* Red Hat CacheFS mailing list
Outdated articles?
* LWN.NET: A general caching filesystem
* LWN.NET: Initial mail introducing CacheFS for Linux