Continuous Data Protection

Continuous data protection (CDP), also called continuous backup or real-time backup, refers to backup of computer data by automatically saving a copy of every change made to that data, essentially capturing every version of the data that the user saves. In its true form it allows the user or administrator to restore data to any point in time. The technique was patented by British entrepreneur Pete Malcolm in 1989 as "a backup system in which a ''copy'' [editor's emphasis] of every change made to a storage medium ''is recorded as the change occurs'' [editor's emphasis]."

In an ''ideal'' case of ''continuous data protection'', the recovery point objective ("the maximum targeted period in which data (transactions) might be lost from an IT service due to a major incident") is zero, even though the recovery time objective ("the targeted duration of time and a service level within which a business process must be restored after a disaster (or disruption) in order to avoid unacceptable consequences associated with a break in business continuity") is not zero. An example of a period in which data transactions ''might'' be lost is a major discount chain having card readers at its checkout counters shut down at multiple locations for close to two hours in June 2019.

CDP runs as a service that captures changes to data to a separate storage location. There are multiple methods for capturing continuous live data changes, involving different technologies that serve different needs. ''True'' CDP-based solutions can provide fine granularities of restorable objects, ranging from crash-consistent images to logical objects such as files, mailboxes, messages, and database files and logs. This isn't necessarily true of ''near''-CDP solutions.
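
The idea of recording every change so that any point in time can be restored can be illustrated with a minimal sketch. It is not taken from any CDP product: the `WriteJournal` class, its method names, and the assumption that the protected volume starts out zero-filled are all hypothetical, chosen only to show how a timestamped journal of writes can be replayed up to a chosen moment.

```python
from dataclasses import dataclass, field


@dataclass
class WriteJournal:
    """Toy CDP journal (hypothetical): keeps every write with its timestamp."""
    size: int
    entries: list = field(default_factory=list)  # (timestamp, offset, data)

    def record_write(self, timestamp: float, offset: int, data: bytes) -> None:
        """Append one write to the journal as the change occurs."""
        if offset < 0 or offset + len(data) > self.size:
            raise ValueError("write falls outside the protected volume")
        if self.entries and timestamp < self.entries[-1][0]:
            raise ValueError("timestamps must be non-decreasing")
        self.entries.append((timestamp, offset, bytes(data)))

    def restore(self, point_in_time: float) -> bytes:
        """Rebuild the volume as it looked at point_in_time."""
        # Assumption: the volume was zero-filled before the first write.
        volume = bytearray(self.size)
        for timestamp, offset, data in self.entries:
            if timestamp > point_in_time:
                break  # later writes are not part of the requested state
            volume[offset:offset + len(data)] = data
        return bytes(volume)


journal = WriteJournal(size=8)
journal.record_write(1.0, 0, b"AAAA")
journal.record_write(2.0, 2, b"BB")
journal.record_write(3.0, 4, b"CCCC")

# Any moment can be chosen at restore time, not only scheduled ones.
state_mid = journal.restore(2.5)
state_end = journal.restore(3.0)
```

In this sketch the restore point is chosen only when `restore` is called, which mirrors the property described above. A real implementation would also need a baseline image and some way to bound journal growth, both of which are omitted here.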


Differences from traditional backup

''True'' continuous data protection differs from traditional backup in that the point in time to recover to need not be specified until the restore is performed. Traditional backups can only restore data as of the time the backup was made. ''True'' continuous data protection, in contrast to "snapshots", has ''no'' backup schedules. When data is written to disk, it is also asynchronously written to a second location, either another computer over the network or an appliance. This introduces some overhead to disk-write operations but eliminates the need for scheduled backups.

Because it allows restoring data to any point in time, "CDP is the gold standard, the most comprehensive and advanced data protection. But 'near CDP' technologies can deliver enough protection for many companies with less complexity and cost. For example, snapshots [see the section below] can provide a reasonable near-CDP-level of protection for file shares, letting users directly access data on the file share at regular intervals, say, every half hour or 15 minutes. That's certainly a higher level of protection than tape-based or disk-based nightly backups and may be all you need." Because "near-CDP does this [copying] at pre-set time intervals", it is essentially incremental backup initiated (separately for each source machine) by timer instead of script.


Continuous vs near continuous

Since ''true'' CDP "backup write operations are executed at the level of the basic input/output system (BIOS) of the microcomputer in such a manner that normal use of the computer is unaffected", ''true'' CDP backup must in practice be run in conjunction with a virtual machine or equivalent, ruling it out for ordinary ''personal'' backup applications. It is therefore discussed in the "Enterprise client-server backup" article, rather than in the "Backup" article.

Some solutions ''marketed'' as continuous data protection may only allow restores at fixed intervals such as 15 minutes, one hour, or 24 hours, because they automatically take incremental backups at those intervals. Such "near-CDP" (short for near-continuous data protection) schemes are not universally recognized as true continuous data protection, as they do not provide the ability to restore to any point in time. When the interval is shorter than one hour, "near-CDP" solutions (for example, Arq Backup) are typically based on periodic "snapshots"; "to avoid downtime, high-availability systems may instead perform the backup on ... a read-only copy of the data set frozen at a point in time, and allow applications to continue writing to their data".

There is debate in the industry as to whether the granularity of backup must be "every write" to be CDP, or whether a "near-CDP" solution that captures the data every few minutes is good enough. The latter is sometimes called near continuous backup. The debate hinges on the use of the term ''continuous'': whether only the backup ''process'' must be continuously ''automatically scheduled'', which is often sufficient to achieve the benefits cited above, or whether the ability to ''restore'' from the backup also must be continuous. The Storage Networking Industry Association (SNIA) uses the "every write" definition. There is a briefer sub-sub-section in the "Backup" article about this, now renamed to "Near-CDP" to avoid confusion.
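
The practical difference between the two definitions can be shown with a small worked example. The write times, the 15-minute snapshot schedule, and the helper names `latest_at_or_before` and `writes_lost` below are illustrative assumptions, not data or functions from any product. The sketch compares the closest restorable point under an "every write" scheme with the closest one under an interval-based scheme.

```python
import bisect


def latest_at_or_before(sorted_times, target):
    """Return the latest time in sorted_times that is <= target, or None."""
    i = bisect.bisect_right(sorted_times, target)
    return sorted_times[i - 1] if i else None


def writes_lost(write_times, restore_point, target):
    """Writes made up to target that the chosen restore point does not contain."""
    if restore_point is None:
        return [t for t in write_times if t <= target]
    return [t for t in write_times if restore_point < t <= target]


# Hypothetical timeline, in minutes since the start of protection.
write_times = [2, 7, 16, 21, 29]
snapshot_times = [0, 15, 30, 45]   # near-CDP: one snapshot every 15 minutes
target = 28                        # the moment the administrator wants back

near_point = latest_at_or_before(snapshot_times, target)  # last snapshot
true_point = latest_at_or_before(write_times, target)     # last captured write

near_lost = writes_lost(write_times, near_point, target)
true_lost = writes_lost(write_times, true_point, target)
```

With these assumed numbers, the interval-based scheme can only go back to the snapshot at minute 15 and so misses the writes at minutes 16 and 21, while the "every write" scheme restores through minute 21 and misses nothing made before the target. In general the worst-case gap for the interval-based scheme approaches the snapshot interval itself.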


Differences from RAID, replication or mirroring

Continuous data protection differs from RAID, replication, or mirroring in that these technologies only protect one copy of the data (the most recent). If data becomes corrupted in a way that is not immediately detected, these technologies simply protect the corrupted data with no way to restore an uncorrupted version. Continuous data protection protects against some effects of data corruption by allowing restoration of a previous, uncorrupted version of the data. Transactions that took place between the corrupting event and the restoration are lost, however. They could be recovered through other means, such as journaling.


Backup disk size

In some situations, continuous data protection requires less space on backup media (usually disk) than traditional backup. Most continuous data protection solutions save ''byte or block-level'' differences rather than ''file-level'' differences. This means that if one byte of a 100 GB file is modified, only the changed byte or block is backed up. Traditional incremental and differential backups make copies of entire files; however, starting around 2013, enterprise client-server backup applications have implemented a capability for block-level ''incremental'' backup, designed for large files such as databases.
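
The space saving from block-level differences can be sketched as follows. This is an illustration under stated assumptions rather than the method of any particular product: the 4096-byte block size, the use of SHA-256 digests to compare blocks, and the function names are all hypothetical, and a 1 MiB buffer stands in for the 100 GB file. File truncation is not handled.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size; real systems vary


def block_digests(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Return one SHA-256 digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[start:start + block_size]).digest()
        for start in range(0, len(data), block_size)
    ]


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> dict:
    """Return {block_index: block_bytes} for blocks of new that differ from old."""
    old_digests = block_digests(old, block_size)
    changed = {}
    for index, start in enumerate(range(0, len(new), block_size)):
        block = new[start:start + block_size]
        is_new_block = index >= len(old_digests)
        if is_new_block or hashlib.sha256(block).digest() != old_digests[index]:
            changed[index] = block
    return changed


original = bytes(1024 * 1024)      # 1 MiB stand-in for a much larger file
modified = bytearray(original)
modified[500_000] = 0xFF           # change a single byte

delta = changed_blocks(original, bytes(modified))
bytes_to_back_up = sum(len(block) for block in delta.values())
```

Here a one-byte change produces a delta of a single 4096-byte block (the one containing byte 500,000), whereas a file-level incremental backup would copy the whole file again.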


Risks and disadvantages

When real-time edits, especially in multimedia and CAD environments, are backed up offsite over the upstream channel of the installation's broadband network, network bandwidth throttling may be needed to reduce the impact of ''true'' CDP. An alternative approach is to back up to a separate Fibre-Channel-connected SAN appliance.


See also

* Quest AppAssure
* Cofio Software
* Disaster recovery
* EMC RecoverPoint
* FalconStor
* InMage DR-Scout
* List of backup software
* List of online backup services
* Single instance storage
* CloudEndure

