Data degradation
   HOME

TheInfoList



OR:

Data degradation is the gradual corruption of
computer data In computer science, data (treated as singular, plural, or as a mass noun) is any sequence of one or more symbols; datum is a single symbol of data. Data requires interpretation to become information. Digital data is data that is represented us ...
due to an accumulation of non-critical failures in a data storage device. The
phenomenon A phenomenon ( : phenomena) is an observable event. The term came into its modern philosophical usage through Immanuel Kant, who contrasted it with the noumenon, which ''cannot'' be directly observed. Kant was heavily influenced by Gottfried ...
is also known as data decay, data rot or bit rot.


Example

Below are several digital images illustrating data degradation, all consisting of 326,272 bits. The original photo is displayed first. In the next image, a single bit was changed from 0 to 1. In the next two images, two and three bits were flipped. On
Linux Linux ( or ) is a family of open-source Unix-like operating systems based on the Linux kernel, an operating system kernel first released on September 17, 1991, by Linus Torvalds. Linux is typically packaged as a Linux distribution, w ...
systems, the binary difference between files can be revealed using command (e.g. ). File:Bitrot in JPEG files, 0 bits flipped.jpg, 0 bits flipped File:Bitrot in JPEG files, 1 bit flipped.jpg, 1 bit flipped File:Bitrot in JPEG files, 2 bits flipped.jpg, 2 bits flipped File:Bitrot in JPEG files, 3 bits flipped.jpg, 3 bits flipped


Primary storages

Data degradation in dynamic random-access memory (DRAM) can occur when the
electric charge Electric charge is the physical property of matter that causes charged matter to experience a force when placed in an electromagnetic field. Electric charge can be ''positive'' or ''negative'' (commonly carried by protons and electrons respe ...
of a
bit The bit is the most basic unit of information in computing and digital communications. The name is a portmanteau of binary digit. The bit represents a logical state with one of two possible values. These values are most commonly represente ...
in DRAM disperses, possibly altering program code or stored data. DRAM may be altered by
cosmic ray Cosmic rays are high-energy particles or clusters of particles (primarily represented by protons or atomic nuclei) that move through space at nearly the speed of light. They originate from the Sun, from outside of the Solar System in our own ...
s or other high-energy particles. Such data degradation is known as a soft error.
ECC memory Error correction code memory (ECC memory) is a type of computer data storage that uses an error correction code (ECC) to detect and correct n-bit data corruption which occurs in memory. ECC memory is used in most computers where data corruption c ...
can be used to mitigate this type of data degradation.


Secondary storages

Data degradation results from the gradual decay of
storage media Data storage is the recording (storing) of information (data) in a storage medium. Handwriting, phonographic recording, magnetic tape, and optical discs are all examples of storage media. Biological molecules such as RNA and DNA are consi ...
over the course of years or longer. Causes vary by medium: * '' Solid-state media'', such as
EPROM An EPROM (rarely EROM), or erasable programmable read-only memory, is a type of programmable read-only memory (PROM) chip that retains its data when its power supply is switched off. Computer memory that can retrieve stored data after a power s ...
s, flash memory and other
solid-state drive A solid-state drive (SSD) is a solid-state storage device that uses integrated circuit assemblies to store data persistently, typically using flash memory, and functioning as secondary storage in the hierarchy of computer storage. It is a ...
s, store data using electrical charges, which can slowly leak away due to imperfect insulation. The chip itself is not affected by this, so reprogramming it approximately once per decade prevents decay. An undamaged copy of the master data is required for the reprogramming. * ''
Magnetic media Magnetic storage or magnetic recording is the storage of data on a magnetized medium. Magnetic storage uses different patterns of magnetisation in a magnetizable material to store data and is a form of non-volatile memory. The information is ac ...
'', such as
hard disk drive A hard disk drive (HDD), hard disk, hard drive, or fixed disk is an electro-mechanical data storage device that stores and retrieves digital data using magnetic storage with one or more rigid rapidly rotating platters coated with magne ...
s, floppy disks and magnetic tapes, may experience data decay as bits lose their magnetic orientation. Periodic refreshing by rewriting the data can alleviate this problem. In warm/humid conditions these media, especially those poorly protected against ambient air, are prone to the physical
decomposition Decomposition or rot is the process by which dead organic substances are broken down into simpler organic or inorganic matter such as carbon dioxide, water, simple sugars and mineral salts. The process is a part of the nutrient cycle and is e ...
of the storage medium. * ''
Optical media In computing and optical disc recording technologies, an optical disc (OD) is a flat, usually circular disc that encodes binary data ( bits) in the form of pits and lands on a special material, often aluminum, on one of its flat surfaces ...
'', such as
CD-R CD-R (Compact disc-recordable) is a digital optical disc storage format. A CD-R disc is a compact disc that can be written once and read arbitrarily many times. CD-R discs (CD-Rs) are readable by most CD readers manufactured prior to the i ...
, DVD-R and BD-R, may experience data decay from the
breakdown Breakdown may refer to: Breaking down *Breakdown (vehicle), failure of a motor vehicle in such a way that it cannot be operated *Chemical decomposition, also called chemical breakdown, the breakdown of a substance into simpler components *Decompo ...
of the storage medium. This can be mitigated by storing discs in a dark, cool, low humidity location. "Archival quality" discs are available with an extended lifetime, but are still not permanent. However, data integrity scanning that measures the rates of various types of errors is able to predict data decay on optical media well ahead of uncorrectable data loss occurring. * '' Paper media'', such as
punched cards A punched card (also punch card or punched-card) is a piece of stiff paper that holds digital data represented by the presence or absence of holes in predefined positions. Punched cards were once common in data processing applications or to di ...
and
punched tape Five- and eight-hole punched paper tape Paper tape reader on the Harwell computer with a small piece of five-hole tape connected in a circle – creating a physical program loop Punched tape or perforated paper tape is a form of data storage ...
, may literally rot.
Mylar BoPET (biaxially-oriented polyethylene terephthalate) is a polyester film made from stretched polyethylene terephthalate (PET) and is used for its high tensile strength, chemical and dimensional stability, transparency, reflectivity, gas and a ...
punched tape is another approach that does not rely on electromagnetic stability.


Hardware failures

Most disk,
disk controller {{unreferenced, date=May 2010 The disk controller is the controller circuit which enables the CPU to communicate with a hard disk, floppy disk or other kind of disk drive. It also provides an interface between the disk drive and the bus conne ...
and higher-level systems are subject to a slight chance of unrecoverable failure. With ever-growing disk capacities, file sizes, and increases in the amount of data stored on a disk, the likelihood of the occurrence of data decay and other forms of uncorrected and undetected
data corruption In the pursuit of knowledge, data (; ) is a collection of discrete values that convey information, describing quantity, quality, fact, statistics, other basic units of meaning, or simply sequences of symbols that may be further interpreted. ...
increases. Low-level disk controllers typically employ
error correction code In computing, telecommunication, information theory, and coding theory, an error correction code, sometimes error correcting code, (ECC) is used for controlling errors in data over unreliable or noisy communication channels. The central idea is ...
s (ECC) to correct erroneous data. Higher-level software systems may be employed to mitigate the risk of such underlying failures by increasing redundancy and implementing integrity checking, error correction codes and self-repairing algorithms. The
ZFS ZFS (previously: Zettabyte File System) is a file system with volume management capabilities. It began as part of the Sun Microsystems Solaris operating system in 2001. Large parts of Solaris – including ZFS – were published under an ope ...
file system was designed to address many of these data corruption issues. The
Btrfs Btrfs (pronounced as "better F S", "butter F S", "b-tree F S", or simply by spelling it out) is a computer storage format that combines a file system based on the copy-on-write (COW) principle with a logical volume manager (not to be confused ...
file system also includes data protection and recovery mechanisms, as does ReFS.


See also

*
Checksum A checksum is a small-sized block of data derived from another block of digital data for the purpose of detecting errors that may have been introduced during its transmission or storage. By themselves, checksums are often used to verify data ...
* Database integrity *
Data curation Data curation is the organization and integration of data collected from various sources. It involves annotation, publication and presentation of the data such that the value of the data is maintained over time, and the data remains available for re ...
*
Data preservation Data preservation is the act of conserving and maintaining both the safety and integrity of data. Preservation is done through formal activities that are governed by policies, regulations and strategies directed towards protecting and prolonging th ...
*
Data scrubbing Data scrubbing is an error correction technique that uses a background task to periodically inspect main memory or storage for errors, then corrects detected errors using redundant data in the form of different checksums or copies of data. Data ...
* Digital permanence *
Digital preservation In library and archival science, digital preservation is a formal endeavor to ensure that digital information of continuing value remains accessible and usable. It involves planning, resource allocation, and application of preservation methods and ...
*
Disc rot Disc rot is the tendency of CD, DVD, or other optical discs to become unreadable because of physical or chemical deterioration. The causes include oxidation of the reflective layer, physical scuffing and abrasion of disc, reactions with contamina ...
* Error detection and correction *
Link rot Link rot (also called link death, link breaking, or reference rot) is the phenomenon of hyperlinks tending over time to cease to point to their originally targeted file, web page, or server due to that resource being relocated to a new address ...
*
Media preservation Preservation of documents, pictures, recordings, digital content, etc., is a major aspect of archival science. It is also an important consideration for people who are creating time capsules, family history, historical documents, scrapbooks and fam ...
* RAR archive file format has optional recovery * PAR2 recovery file format


References

{{Data Computer jargon Data quality