
Data degradation is the gradual corruption of computer data due to an accumulation of non-critical failures in a data storage device. The phenomenon is also known as data decay, data rot or bit rot.


Example

Below are several digital images illustrating data degradation, all consisting of 326,272 bits. The original photo is displayed first. In the next image, a single bit was changed from 0 to 1. In the next two images, two and three bits were flipped. On Linux systems, the binary difference between files can be revealed using a command-line comparison tool.

[Images: Bitrot in JPEG files, with 0 bits flipped, 1 bit flipped, 2 bits flipped and 3 bits flipped]
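The single-bit change described above can be reproduced on any byte string. The following is a minimal Python sketch, not the procedure used to make the images: the helper names `flip_bit` and `differing_bytes` are hypothetical, and the buffer is an arbitrary stand-in of the same size (326,272 bits, or 40,784 bytes) rather than the actual photo.

```python
def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with one bit inverted (bit 0 is the MSB of byte 0)."""
    byte_index, offset = divmod(bit_index, 8)
    corrupted = bytearray(data)
    corrupted[byte_index] ^= 0x80 >> offset
    return bytes(corrupted)


def differing_bytes(a: bytes, b: bytes) -> list[tuple[int, int, int]]:
    """List (offset, byte_in_a, byte_in_b) for every position where a and b differ."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Arbitrary stand-in for the image data: 40,784 bytes = 326,272 bits.
original = bytes(range(256)) * 159 + bytes(80)

# Simulate one bit of decay, then locate it by comparing the two buffers.
damaged = flip_bit(original, 1000)
diff = differing_bytes(original, damaged)
print(len(original) * 8, diff)
```

In a compressed format such as JPEG, one flipped bit can change how much of the remaining stream is decoded, which is why a change this small can visibly damage an image.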


Primary storage

Data degradation in dynamic random-access memory (DRAM) can occur when the electric charge of a bit in DRAM disperses, possibly altering program code or stored data. DRAM may be altered by cosmic rays or other high-energy particles. Such data degradation is known as a soft error. ECC memory can be used to mitigate this type of data degradation.
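To illustrate how an error correction code can repair a soft error, here is a minimal sketch of the classic Hamming(7,4) code in Python. It is an illustration only: the function names are hypothetical, and real ECC memory typically uses wider codes than this 4-bit example.

```python
def hamming74_encode(nibble: int) -> list[int]:
    """Encode a 4-bit value as a 7-bit codeword (list index 0 is position 1)."""
    if not 0 <= nibble <= 0xF:
        raise ValueError("nibble must be in the range 0..15")
    d1, d2, d3, d4 = ((nibble >> shift) & 1 for shift in (3, 2, 1, 0))
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(codeword: list[int]) -> tuple[int, int]:
    """Return (nibble, corrected_position); position 0 means no error was seen."""
    bits = list(codeword)
    # The XOR of the 1-based positions of all set bits is zero for a valid
    # codeword and equals the position of the flipped bit after a single error.
    syndrome = 0
    for position, bit in enumerate(bits, start=1):
        if bit:
            syndrome ^= position
    if syndrome:
        bits[syndrome - 1] ^= 1  # repair the single flipped bit
    d1, d2, d3, d4 = bits[2], bits[4], bits[5], bits[6]
    return (d1 << 3) | (d2 << 2) | (d3 << 1) | d4, syndrome


stored = hamming74_encode(0b1011)
stored[4] ^= 1  # simulate a soft error at position 5
value, fixed_at = hamming74_decode(stored)
print(bin(value), fixed_at)
```

This code corrects any single flipped bit per codeword. Two flipped bits in the same codeword would be miscorrected, which is one reason practical schemes add a further parity bit to detect double errors.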


Secondary storage

Data degradation results from the gradual decay of storage media over the course of years or longer. Causes vary by medium:
* ''Solid-state media'', such as EPROMs, flash memory and other solid-state drives, store data using electrical charges, which can slowly leak away due to imperfect insulation. The chip itself is not affected by this, so reprogramming it approximately once per decade prevents decay. An undamaged copy of the master data is required for the reprogramming.
* ''Magnetic media'', such as hard disk drives, floppy disks and magnetic tapes, may experience data decay as bits lose their magnetic orientation. Periodic refreshing by rewriting the data can alleviate this problem. In warm/humid conditions these media, especially those poorly protected against ambient air, are prone to the physical decomposition of the storage medium.
* ''Optical media'', such as CD-R, DVD-R and BD-R, may experience data decay from the breakdown of the storage medium. This can be mitigated by storing discs in a dark, cool, low humidity location. "Archival quality" discs are available with an extended lifetime, but are still not permanent. However, data integrity scanning that measures the rates of various types of errors is able to predict data decay on optical media well ahead of uncorrectable data loss occurring.
* ''Paper media'', such as punched cards and punched tape, may literally rot. Mylar punched tape is another approach that does not rely on electromagnetic stability.
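The periodic refreshing mentioned above can be sketched at the file level: verify that the content still matches a known-good digest, then write it out again. This Python sketch rests on assumptions. The names `sha256_of` and `refresh_file` are hypothetical, the digest is assumed to have been recorded while the data was known to be intact, and whether a rewrite actually renews the physical cells or magnetic domains depends on the file system and the device.

```python
import hashlib
import os
import tempfile


def sha256_of(path: str) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def refresh_file(path: str, expected_sha256: str) -> bool:
    """Rewrite path if its content still matches the known-good digest.

    Returns True after a successful rewrite. Returns False, leaving the file
    untouched, if the content no longer matches; it then has to be restored
    from an undamaged master copy instead.
    """
    with open(path, "rb") as handle:
        data = handle.read()
    if hashlib.sha256(data).hexdigest() != expected_sha256:
        return False
    directory = os.path.dirname(os.path.abspath(path))
    fd, temp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(temp_path, path)  # swap the fresh copy into place
    finally:
        if os.path.exists(temp_path):
            os.unlink(temp_path)  # only reached if the swap did not happen
    return True


with tempfile.TemporaryDirectory() as workdir:
    sample = os.path.join(workdir, "archive.bin")
    with open(sample, "wb") as handle:
        handle.write(b"long-lived data")
    known_good = sha256_of(sample)  # recorded while the data is intact
    print(refresh_file(sample, known_good))
```

Checking the digest before rewriting matters: refreshing data that has already decayed would only preserve the damage.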


Hardware failures

Most disk, disk controller and higher-level systems are subject to a slight chance of unrecoverable failure. With ever-growing disk capacities, file sizes, and increases in the amount of data stored on a disk, the likelihood of the occurrence of data decay and other forms of uncorrected and undetected data corruption increases. Low-level disk controllers typically employ error correction codes (ECC) to correct erroneous data. Higher-level software systems may be employed to mitigate the risk of such underlying failures by increasing redundancy and implementing integrity checking, error correction codes and self-repairing algorithms. The ZFS file system was designed to address many of these data corruption issues. The Btrfs file system also includes data protection and recovery mechanisms, as does ReFS.
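The combination of redundancy, integrity checking and self-repair can be illustrated with a toy model: two mirrored copies of each block plus a stored checksum per block. This Python sketch is an assumption-laden illustration and does not reproduce how ZFS, Btrfs or ReFS work internally. The names `checksum` and `scrub` are hypothetical, and the checksums are assumed to be stored somewhere trustworthy.

```python
import hashlib


def checksum(block: bytes) -> str:
    """Return the SHA-256 hex digest used to verify one block."""
    return hashlib.sha256(block).hexdigest()


def scrub(mirror_a: list[bytes], mirror_b: list[bytes],
          checksums: list[str]) -> list[int]:
    """Verify every block on both mirrors and repair bad copies in place.

    Returns the indices of blocks that are damaged on both mirrors and
    therefore cannot be repaired from this redundancy alone.
    """
    unrecoverable = []
    for index, expected in enumerate(checksums):
        a_ok = checksum(mirror_a[index]) == expected
        b_ok = checksum(mirror_b[index]) == expected
        if a_ok and not b_ok:
            mirror_b[index] = mirror_a[index]  # heal B from the good copy
        elif b_ok and not a_ok:
            mirror_a[index] = mirror_b[index]  # heal A from the good copy
        elif not a_ok and not b_ok:
            unrecoverable.append(index)
    return unrecoverable


blocks = [b"alpha", b"bravo", b"charlie"]
checksums = [checksum(block) for block in blocks]
mirror_a, mirror_b = list(blocks), list(blocks)

mirror_a[1] = b"brawo"  # simulate silent corruption on one mirror
lost = scrub(mirror_a, mirror_b, checksums)
print(mirror_a == blocks, lost)
```

The checksum is what makes the repair possible: with two copies alone, a mismatch shows that one copy is wrong but not which one.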


See also

* Checksum
* Database integrity
* Data curation
* Data preservation
* Data scrubbing
* Digital permanence
* Digital preservation
* Disc rot
* Error detection and correction
* Link rot
* Media preservation
* RAR archive file format has optional recovery
* PAR2 recovery file format

