
In
computer systems
A computer is a machine that can be Computer programming, programmed to automatically Execution (computing), carry out sequences of arithmetic or logical operations (''computation''). Modern digital electronic computers can perform generic set ...
, a snapshot is the
state
State most commonly refers to:
* State (polity), a centralized political organization that regulates law and society within a territory
**Sovereign state, a sovereign polity in international law, commonly referred to as a country
**Nation state, a ...
of a system at a particular point in time. The term was coined as an analogy to that in
photography
Photography is the visual arts, art, application, and practice of creating images by recording light, either electronically by means of an image sensor, or chemically by means of a light-sensitive material such as photographic film. It is empl ...
.
Rationale
A full
backup
In information technology, a backup, or data backup is a copy of computer data taken and stored elsewhere so that it may be used to restore the original after a data loss event. The verb form, referring to the process of doing so, is "wikt:back ...
of a large data set may take a long time to complete. On
multi-tasking or
multi-user systems, there may be writes to that data while it is being backed up. This prevents the backup from being
atomic and introduces a version skew that may result in
data corruption
Data corruption refers to errors in computer data that occur during writing, reading, storage, transmission, or processing, which introduce unintended changes to the original data. Computer, transmission, and storage systems use a number of meas ...
. For example, if a user moves a file into a directory that has already been backed up, then that file would be completely missing on the
backup media, since the backup operation had already taken place before the addition of the file. Version skew may also cause corruption with files which change their size or contents underfoot while being read.
One
approach
Approach may refer to:
Aviation
*Visual approach
*Instrument approach
* Final approach
Music
* ''Approach'' (album), by Von Hertzen Brothers
* ''The Approach'', an album by I:Scintilla
Other uses
*Approach Beach, a gazetted beach in Ting Kau, H ...
to safely backing up live data is to temporarily disable write access to data during the backup, either by stopping the accessing applications or by using the
locking API
An application programming interface (API) is a connection between computers or between computer programs. It is a type of software interface, offering a service to other pieces of software. A document or standard that describes how to build ...
provided by the operating system to enforce exclusive read access. This is tolerable for low-availability systems (on desktop computers and small workgroup servers, on which regular
downtime
In computing and telecommunications, downtime (also (system) outage or (system) drought colloquially) is a period when a system is unavailable. The unavailability is the proportion of a time-span that a system is unavailable or offline.
This is ...
is acceptable). High-availability
24/7 systems, however, cannot bear service stoppages.
To avoid downtime, high-availability systems may instead perform the backup on a ''snapshot''—a
read-only copy of the data set frozen at a
point in time—and allow applications to continue writing to their data. Most snapshot implementations are efficient and can create snapshots in ''
O(1)''. In other words, the time and I/O needed to create the snapshot does not increase with the size of the data set; by contrast, the time and I/O required for a direct backup is proportional to the size of the data set. In some systems once the initial snapshot is taken of a data set, subsequent snapshots copy the changed data only, and use a system of pointers to reference the initial snapshot. This method of pointer-based snapshots consumes less disk capacity than if the data set was repeatedly cloned.
Implementations
Volume managers
Some
Unix
Unix (, ; trademarked as UNIX) is a family of multitasking, multi-user computer operating systems that derive from the original AT&T Unix, whose development started in 1969 at the Bell Labs research center by Ken Thompson, Dennis Ritchie, a ...
systems have snapshot-capable
logical volume managers. These implement
copy-on-write
Copy-on-write (COW), also called implicit sharing or shadowing, is a resource-management technique used in programming to manage shared data efficiently. Instead of copying data right away when multiple programs use it, the same data is shared ...
on entire
block devices by copying changed blocksjust before they are to be overwritten within "parent" volumesto other storage, thus preserving a self-consistent past image of the block device. Filesystems on such snapshot images can later be mounted as if they were on a read-only media.
Some volume managers also allow creation of ''writable'' snapshots, extending the copy-on-write approach by disassociating any blocks modified within the snapshot from their "parent" blocks in the original volume. Such a scheme could be also described as performing additional copy-on-write operations triggered by the writes to snapshots.
On Linux,
Logical Volume Manager (LVM) allows creation of both read-only and read-write snapshots. Writable snapshots were introduced with the LVM version 2 (LVM2).
File systems
Some file systems, such as
WAFL,
fossil
A fossil (from Classical Latin , ) is any preserved remains, impression, or trace of any once-living thing from a past geological age. Examples include bones, shells, exoskeletons, stone imprints of animals or microbes, objects preserve ...
for
Plan 9 from Bell Labs
Plan 9 from Bell Labs is a distributed operating system which originated from the Computing Science Research Center (CSRC) at Bell Labs in the mid-1980s and built on UNIX concepts first developed there in the late 1960s. Since 2000, Plan 9 has ...
, and
ODS-5, internally track old versions of files and make snapshots available through a special
namespace
In computing, a namespace is a set of signs (''names'') that are used to identify and refer to objects of various kinds. A namespace ensures that all of a given set of objects have unique names so that they can be easily identified.
Namespaces ...
. Others, like
UFS2, provide an operating system
API
An application programming interface (API) is a connection between computers or between computer programs. It is a type of software interface, offering a service to other pieces of software. A document or standard that describes how to build ...
for accessing file histories. In
NTFS
NT File System (NTFS) (commonly called ''New Technology File System'') is a proprietary journaling file system developed by Microsoft in the 1990s.
It was developed to overcome scalability, security and other limitations with File Allocation Tabl ...
, access to snapshots is provided by the Volume Shadow-copying Service (VSS) in
Windows XP
Windows XP is a major release of Microsoft's Windows NT operating system. It was released to manufacturing on August 24, 2001, and later to retail on October 25, 2001. It is a direct successor to Windows 2000 for high-end and business users a ...
and
Windows Server 2003
Windows Server 2003, codenamed "Whistler Server", is the sixth major version of the Windows NT operating system produced by Microsoft and the first server version to be released under the Windows Server brand name. It is part of the Windows NT ...
and
Shadow Copy in
Windows Vista
Windows Vista is a major release of the Windows NT operating system developed by Microsoft. It was the direct successor to Windows XP, released five years earlier, which was then the longest time span between successive releases of Microsoft W ...
. Melio FS provides snapshots via the same VSS interface for shared storage. Snapshots have also been available in the NSS (
Novell Storage Services) file system on
NetWare
NetWare is a discontinued computer network operating system developed by Novell, Inc. It initially used cooperative multitasking to run various services on a personal computer, using the IPX network protocol. The final update release was ver ...
since version 4.11, and more recently on
Linux
Linux ( ) is a family of open source Unix-like operating systems based on the Linux kernel, an kernel (operating system), operating system kernel first released on September 17, 1991, by Linus Torvalds. Linux is typically package manager, pac ...
platforms in the
Open Enterprise Server product. On
AIX,
JFS2 supports snapshots. On
macOS
macOS, previously OS X and originally Mac OS X, is a Unix, Unix-based operating system developed and marketed by Apple Inc., Apple since 2001. It is the current operating system for Apple's Mac (computer), Mac computers. With ...
,
APFS supports snapshots.
EMC's Isilon OneFS clustered storage platform implements a single scalable file system that supports read-only snapshots at the file or directory level. Any file or directory within the file system can be snapshotted and the system will implement a copy-on-write or point-in-time snapshot dynamically based on which method is determined to be optimal for the system.
On Linux, the
Btrfs
Btrfs (pronounced as "better F S", "butter F S", "b-tree F S", or "B.T.R.F.S.") is a computer storage format that combines a file system based on the copy-on-write (COW) principle with a logical volume manager (distinct from Linux's LVM), d ...
and
OCFS2 file systems support creating snapshots (cloning) of individual files. Additionally, Btrfs also supports the creation of snapshots of subvolumes.
See also
*
Application checkpointing
Checkpointing is a technique that provides fault tolerance for computing systems. It involves saving a Snapshot (computer storage), snapshot of an application software, application's state, so that it can restart from that point in case of failure ...
*
Persistence (computer science)
In computer science, persistence refers to the characteristic of State (computer science), state of a system that outlives (persists for longer than) the Process (computing), process that created it. This is achieved in practice by storing the st ...
*
Sandbox (computer security)
In computer security, a sandbox is a security mechanism for separating running programs, usually in an effort to mitigate system failures and/or software vulnerabilities from spreading. The sandbox metaphor derives from the concept of a child's ...
*
Storage Hypervisor
*
System image
*
Virtual machine
In computing, a virtual machine (VM) is the virtualization or emulator, emulation of a computer system. Virtual machines are based on computer architectures and provide the functionality of a physical computer. Their implementations may involve ...
Notes
References
External links
*
* {{cite web, title=Storage Basics: Backup Strategies, date=2003-09-24, first=Mike, last=Harwood, url=http://www.enterprisestorageforum.com/management/features/article.php/3082691
Backup
Fault-tolerant computer systems
Persistence