In computer main memory, auxiliary storage and computer buses, data redundancy is the existence of data that is additional to the actual data and permits correction of errors in stored or transmitted data. The additional data can simply be a complete copy of the actual data (a type of repetition code), or only select pieces of data that allow detection of errors and reconstruction of lost or damaged data up to a certain level.
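The complete-copy case can be illustrated with a three-fold repetition code. The following is a minimal sketch, not a reference implementation: the function names `encode` and `decode` and the choice of three copies per bit are assumptions made for illustration.

```python
def encode(bits, n=3):
    """Repeat every bit n times (n should be odd so a majority always exists)."""
    return [b for b in bits for _ in range(n)]


def decode(received, n=3):
    """Recover each original bit by majority vote over its n copies."""
    decoded = []
    for i in range(0, len(received), n):
        block = received[i:i + n]
        decoded.append(1 if sum(block) > n // 2 else 0)
    return decoded


message = [1, 0, 1, 1]
sent = encode(message)      # 12 bits carry 4 bits of actual data

# Simulate a noisy channel: flip one copy of the first bit and one of the last.
corrupted = sent.copy()
corrupted[1] ^= 1
corrupted[9] ^= 1

recovered = decode(corrupted)
```

With three copies, one flipped copy per bit is outvoted by the other two; if two copies of the same bit are flipped, the vote decodes to the wrong value, which is the "up to a certain level" limit in practice.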
For example, by including additional data checksums, ECC memory is capable of detecting and correcting single-bit errors within each memory word, while RAID 1 combines two hard disk drives (HDDs) into a logical storage unit that allows stored data to survive a complete failure of one drive. Data redundancy can also be used as a measure against silent data corruption; for example, file systems such as Btrfs and ZFS use data and metadata checksumming in combination with copies of stored data to detect silent data corruption and repair its effects.
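The combination of checksums and stored copies can be sketched in a few lines. This is a simplified illustration under stated assumptions, not a description of how Btrfs, ZFS or any RAID implementation works internally: the `MirroredStore` class, its in-memory "drives" and the use of SHA-256 as the checksum are all hypothetical choices.

```python
import hashlib


def checksum(block: bytes) -> str:
    # SHA-256 is an assumption; real systems choose their own checksum functions.
    return hashlib.sha256(block).hexdigest()


class MirroredStore:
    """Hypothetical store keeping two copies of each block plus one checksum."""

    def __init__(self):
        self.copies = [{}, {}]   # two simulated drives
        self.sums = {}           # checksum recorded at write time

    def write(self, key, block: bytes):
        for drive in self.copies:
            drive[key] = block
        self.sums[key] = checksum(block)

    def read(self, key) -> bytes:
        expected = self.sums[key]
        for drive in self.copies:
            block = drive[key]
            if checksum(block) == expected:
                # A good copy was found: overwrite any copy that differs from it.
                for other in self.copies:
                    if other[key] != block:
                        other[key] = block
                return block
        raise OSError(f"all copies of {key!r} fail their checksum")


store = MirroredStore()
store.write("a", b"payload")
store.copies[0]["a"] = b"pbyload"   # simulate silent corruption on one drive
data = store.read("a")              # detected by checksum, served from the mirror
```

The checksum alone only detects the damage; it is the second copy that makes repair possible. If every copy fails verification, the sketch reports an error rather than returning corrupt data.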
In database systems
While different in nature, data redundancy also occurs in database systems that have values repeated unnecessarily in one or more records or fields within a table, or where a field is replicated in two or more tables. This is often found in unnormalized database designs; it complicates database management, introduces the risk of corrupting the data, and increases the required amount of storage. When done on purpose from a previously normalized database schema, it may be considered a form of database denormalization, used to improve the performance of database queries (shorten the database response time).
For instance, when customer data are duplicated and attached to each product bought, the redundancy is a known source of inconsistency, since a given customer might appear with different values for one or more of their attributes. Data redundancy leads to data anomalies and corruption and should generally be avoided by design; applying database normalization prevents redundancy and makes the best possible use of storage.
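The customer example can be reproduced with Python's built-in `sqlite3` module. The sketch below is illustrative only: the table names, column names and sample rows are invented for this example and do not come from any particular schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Unnormalized design: customer attributes are repeated on every purchase row.
cur.execute(
    "CREATE TABLE purchases_flat (product TEXT, customer_name TEXT, customer_email TEXT)"
)
cur.executemany(
    "INSERT INTO purchases_flat VALUES (?, ?, ?)",
    [
        ("keyboard", "Ada", "ada@example.com"),
        ("monitor", "Ada", "ada@example.com"),
    ],
)
# An update that reaches only one row leaves the same customer with two emails.
cur.execute(
    "UPDATE purchases_flat SET customer_email = 'ada@new.example' "
    "WHERE product = 'keyboard'"
)
flat_emails = cur.execute(
    "SELECT DISTINCT customer_email FROM purchases_flat WHERE customer_name = 'Ada'"
).fetchall()

# Normalized design: each customer is stored once and referenced by id.
cur.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
cur.execute(
    "CREATE TABLE purchases (product TEXT, customer_id INTEGER REFERENCES customers(id))"
)
cur.execute("INSERT INTO customers VALUES (1, 'Ada', 'ada@example.com')")
cur.executemany("INSERT INTO purchases VALUES (?, ?)", [("keyboard", 1), ("monitor", 1)])
cur.execute("UPDATE customers SET email = 'ada@new.example' WHERE id = 1")
normalized_emails = cur.execute(
    "SELECT DISTINCT c.email FROM purchases p JOIN customers c ON c.id = p.customer_id"
).fetchall()
```

In the flat table the partial update leaves two distinct email values for one customer, whereas in the normalized tables the email exists in exactly one row, so a single update is seen by every purchase that references it.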
See also
* Data maintenance
* Data deduplication
* Data scrubbing
* End-to-end data protection
* Redundancy (engineering)
* Redundancy (information theory)