Dd (Unix)

dd is a command-line utility for Unix, Plan 9, Inferno, and Unix-like operating systems and beyond, the primary purpose of which is to convert and copy files. On Unix, device drivers for hardware (such as hard disk drives) and special device files (such as /dev/zero and /dev/random) appear in the file system just like normal files; dd can also read and/or write from/to these files, provided that function is implemented in their respective driver. As a result, dd can be used for tasks such as backing up the boot sector of a hard drive, and obtaining a fixed amount of random data. The dd program can also perform conversions on the data as it is copied, including byte order swapping and conversion to and from the ASCII and EBCDIC text encodings.


History

The name dd is an allusion to the DD statement found in IBM's Job Control Language (JCL), in which it is an abbreviation for "Data Definition". The command's syntax resembles a JCL statement more than other Unix commands do, so much so that Eric S. Raymond says "the interface design was clearly a prank". The interface is redesigned in Plan 9's dd command to use a command-line option style. dd is sometimes humorously called "Disk Destroyer", due to its drive-erasing capabilities.

Originally intended to convert between ASCII and EBCDIC, dd first appeared in Version 5 Unix. The dd command has been specified since the X/Open Portability Guide issue 2 of 1987. This is inherited by IEEE Std 1003.1-2008 (POSIX), which is part of the Single UNIX Specification.

The version of dd bundled in GNU coreutils was written by Paul Rubin, David MacKenzie, and Stuart Kemp. The command is available as a separate package for Microsoft Windows as part of the UnxUtils collection of native Win32 ports of common GNU Unix-like utilities.


Usage

The command line syntax of dd differs from many other Unix programs. It uses the syntax option=value for its command-line options rather than the more standard -option value or --option=value formats. By default, dd reads from stdin and writes to stdout, but these can be changed by using the if (input file) and of (output file) options. Certain features of dd will depend on the computer system capabilities, such as dd's ability to implement an option for direct memory access. Sending a SIGINFO signal (or a USR1 signal on Linux) to a running dd process makes it print I/O statistics to standard error once and then continue copying. dd can read standard input from the keyboard. When end-of-file (EOF) is reached, dd will exit. Signals and EOF are determined by the software. For example, Unix tools ported to Windows vary as to the EOF: Cygwin uses Ctrl+D (the usual Unix EOF) and MKS Toolkit uses Ctrl+Z (the usual Windows EOF). The non-standardized parts of dd invocation vary among implementations.
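
As a minimal sketch of the option=value syntax, the following copies a small scratch file twice: once through standard input and output, and once through the if and of operands. The scratch directory and file names are placeholders chosen for illustration only.

```shell
# Scratch directory and file names are illustrative placeholders.
workdir=$(mktemp -d)
printf 'hello, dd\n' > "$workdir/in.txt"

# With no operands, dd copies standard input to standard output.
dd < "$workdir/in.txt" > "$workdir/via_stdio.txt" 2>/dev/null

# The same copy, naming the files with the if= and of= operands.
dd if="$workdir/in.txt" of="$workdir/via_operands.txt" 2>/dev/null

# Both copies should be byte-for-byte identical to the original.
cmp "$workdir/in.txt" "$workdir/via_stdio.txt" &&
  cmp "$workdir/in.txt" "$workdir/via_operands.txt" &&
  echo "copies match"
```

The statistics that dd writes to standard error are discarded here with 2>/dev/null so that only the result of the comparison is shown.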


Output messages

On completion, dd prints statistics about the data transfer to the stderr stream. The format is standardized in POSIX. The manual page for GNU dd does not describe this format, but the BSD manuals do. Each of the "Records in" and "Records out" lines shows the number of complete blocks transferred plus the number of partial blocks, e.g. because the physical medium ended before a complete block was read, or a physical error prevented reading the complete block.
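
The full-plus-partial notation can be observed with a small experiment. This is a sketch under the assumption that the messages are not localized (hence LC_ALL=C); the file name is a placeholder, and implementations may print further lines after the two record counts.

```shell
# File names are illustrative; LC_ALL=C keeps the messages untranslated.
workdir=$(mktemp -d)
printf 'hello' > "$workdir/five_bytes"      # exactly 5 bytes of input

# bs=4 yields one full 4-byte block and one partial 1-byte block.
stats=$(LC_ALL=C dd if="$workdir/five_bytes" of=/dev/null bs=4 2>&1)

# Print the captured statistics; the first two lines are:
#   1+1 records in
#   1+1 records out
printf '%s\n' "$stats"
```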


Block size

A block is a unit measuring the number of bytes that are read, written, or converted at one time. Command-line options can specify a different block size for input/reading (ibs) compared to output/writing (obs), though the block size (bs) option will override both ibs and obs. The default value for both input and output block sizes is 512 bytes (the traditional block size of disks, and POSIX-mandated size of "a block"). The count option for copying is measured in blocks, as are both the skip count for reading and seek count for writing. Conversion operations are also affected by the "conversion block size" (cbs).

The value provided for block size options is interpreted as a decimal (base 10) integer number of bytes. It can also contain suffixes to indicate that the block size is an integer number of larger units than bytes. POSIX only specifies the suffixes b (blocks) for 512 and k (kibibytes) for 1024. Implementations differ on the additional suffixes they support: (Free) BSD uses lowercase m (mebibytes), g (gibibytes), and so on for tebibytes, pebibytes, exbibytes, zebibytes, and yobibytes, while GNU uses M and G for the same units, with kB, MB, and GB used for their SI unit counterparts (kilobytes and so on). For example, for GNU dd, bs=16M indicates a block size of 16 mebibytes (16777216 bytes) and bs=3kB specifies 3000 bytes.

Additionally, some implementations understand the x character as a multiplication operator for both block size and count parameters. For example, bs=2x80x18b is interpreted as 2 × 80 × 18 × 512 = 1474560 bytes, the exact size of a 1440 KiB floppy disk. This is required in POSIX, but support has reportedly varied among implementations. As a result, it is more portable to use the POSIX shell arithmetic syntax of bs=$((2*80*18))b.

Block size has an effect on the performance of copying commands. Doing many small reads or writes is often slower than doing fewer large ones. Using large blocks requires more RAM and can complicate error recovery. When dd is used with variable-block-size devices such as tape drives or networks, the block size may determine the tape record size or packet size, depending on the network protocol used.
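
The floppy-disk arithmetic above can be checked directly in the shell. The sketch below assumes a dd that accepts the POSIX b suffix; the image file name is a placeholder.

```shell
# Floppy geometry: 2 sides x 80 tracks x 18 sectors, 512 bytes per sector.
sectors=$((2 * 80 * 18))
echo "$((sectors * 512))"                   # 1474560

# Create a blank floppy-sized image as a single block.
# The b suffix multiplies the number by 512.
workdir=$(mktemp -d)
dd if=/dev/zero of="$workdir/floppy.img" bs="${sectors}b" count=1 2>/dev/null

# $(( ... )) strips any padding that wc may add around the number.
size=$(($(wc -c < "$workdir/floppy.img")))
echo "$size"                                # 1474560
```

Letting the shell do the multiplication keeps the dd operand down to a plain number with a single standard suffix.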


Uses

The dd command can be used for a variety of purposes. For plain-copying commands it tends to be slower than the domain-specific alternatives, but it excels at its unique ability to "overwrite or truncate a file at any point or seek in a file", a fairly low-level interface to the Unix file API. The examples below assume the use of GNU dd, mainly in the block size argument. To make them portable, replace e.g. bs=64M with the shell arithmetic expression bs=$((64*1024*1024)) or bs=$((64 << 20)) (written equivalently with a bit shift).


Data transfer

dd can duplicate data across files, devices, partitions and volumes. The data may be input or output to and from any of these; but there are important differences concerning the output when going to a partition. Also, during the transfer, the data can be modified using the conv options to suit the medium. (For this purpose, however, dd is slower than cat.) The noerror option means to keep going if there is an error, while the sync option causes output blocks to be padded.
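
The padding behavior of the sync conversion can be seen on an ordinary file. This is a minimal sketch with placeholder file names; no read errors occur here, so noerror is included only to show how the two conversions are combined.

```shell
workdir=$(mktemp -d)
printf 'hello' > "$workdir/src"     # 5 bytes: one full and one partial 4-byte block

# noerror: keep going after read errors.
# sync: pad each short input block with NUL bytes up to the block size.
dd if="$workdir/src" of="$workdir/dst" bs=4 conv=noerror,sync 2>/dev/null

padded_size=$(($(wc -c < "$workdir/dst")))
echo "$padded_size"                 # 8: the partial block was padded to 4 bytes
```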


In-place modification

dd can modify data in place. For example, the first 512 bytes of a file can be overwritten with null bytes by copying one 512-byte block from /dev/zero (bs=512 count=1) with conv=notrunc. The notrunc conversion option means do not truncate the output file: if the output file already exists, just replace the specified bytes and leave the rest of the output file alone. Without this option, dd would create an output file 512 bytes long.
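
The in-place overwrite can be tried safely on a scratch file rather than on real data. In this sketch the file name and its 1024-byte size are assumptions made for the demonstration.

```shell
workdir=$(mktemp -d)
target="$workdir/data.bin"

# Build a 1024-byte file filled with the letter x.
dd if=/dev/zero bs=1024 count=1 2>/dev/null | tr '\000' 'x' > "$target"

# Overwrite only the first 512 bytes with NULs; notrunc keeps the tail.
dd if=/dev/zero of="$target" bs=512 count=1 conv=notrunc 2>/dev/null

# Count the non-NUL bytes left in each half of the file.
size=$(($(wc -c < "$target")))
head_x=$(($(dd if="$target" bs=512 count=1 2>/dev/null | tr -d '\000' | wc -c)))
tail_x=$(($(dd if="$target" bs=512 skip=1 2>/dev/null | tr -d '\000' | wc -c)))
echo "$size $head_x $tail_x"        # 1024 0 512
```

Dropping conv=notrunc from the second command would leave a 512-byte file, since dd would truncate the output before writing.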


Master boot record backup and restore

The same technique can also be used to back up and restore any region of a device to a file, such as a master boot record. To duplicate the first two sectors of a floppy disk, read two 512-byte blocks (bs=512 count=2) from the floppy device into an image file.
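
A backup-and-restore round trip can be rehearsed without touching a real device. In the sketch below an ordinary file named disk.img stands in for the device node, and boot.bak is a hypothetical backup file name; on a real system the device path would replace disk.img.

```shell
workdir=$(mktemp -d)
disk="$workdir/disk.img"            # stand-in for a real device node

# Fake 4-sector "disk": a sector of A, a sector of B, two zeroed sectors.
{
  dd if=/dev/zero bs=512 count=1 2>/dev/null | tr '\000' 'A'
  dd if=/dev/zero bs=512 count=1 2>/dev/null | tr '\000' 'B'
  dd if=/dev/zero bs=512 count=2 2>/dev/null
} > "$disk"

# Back up the first two sectors.
dd if="$disk" of="$workdir/boot.bak" bs=512 count=2 2>/dev/null

# Damage them, then restore from the backup without truncating the rest.
dd if=/dev/zero of="$disk" bs=512 count=2 conv=notrunc 2>/dev/null
dd if="$workdir/boot.bak" of="$disk" bs=512 count=2 conv=notrunc 2>/dev/null

backup_size=$(($(wc -c < "$workdir/boot.bak")))
disk_size=$(($(wc -c < "$disk")))
echo "$backup_size $disk_size"      # 1024 2048
```

The conv=notrunc operand matters only because the stand-in is a regular file; it has no effect when the output is a block device.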


Disk wipe

For security reasons, it is sometimes necessary to wipe the disk of a discarded device. This can be achieved by a "data transfer" from the Unix special files.
* To write zeros to a disk, use dd if=/dev/zero of=/dev/sda bs=16M.
* To write random data to a disk, use dd if=/dev/urandom of=/dev/sda bs=16M.
When compared to the data modification example above, the notrunc conversion option is not required as it has no effect when dd's output file is a block device. The bs=16M option makes dd read and write 16 mebibytes at a time. For modern systems, an even greater block size may be faster. Note that filling the drive with random data may take longer than zeroing the drive, because the random data must be created by the CPU, while creating zeroes is very fast. On modern hard-disk drives, zeroing the drive will render most data it contains permanently irrecoverable. However, with other kinds of drives such as flash memories, much data may still be recoverable by data remanence.

Modern hard disk drives contain a Secure Erase command designed to permanently and securely erase every accessible and inaccessible portion of a drive. It may also work for some solid-state drives (flash drives). As of 2017, it does not work on USB flash drives or on Secure Digital flash memories. When available, this is both faster than using dd, and more secure. On Linux machines it is accessible via the hdparm command's secure-erase options (such as --security-erase). The shred program offers multiple overwrites, as well as more secure deletion of individual files.


Data recovery

Data recovery involves reading from a drive with some parts potentially inaccessible. dd is a good fit for this job with its flexible skipping (skip for input, seek for output) and other low-level settings. The vanilla dd, however, is clumsy to use as the user has to read the error messages and manually calculate the regions that can be read. The single block size also limits the granularity of the recovery, as a trade-off has to be made: either use a small one for more data recovered or use a large one for speed.

A C program called dd_rescue was written in October 1999. It did away with the conversion functionality of dd, and supports two block sizes to deal with the dilemma. If a read using a large size fails, it falls back to the smaller size to gather as much data as possible. It can also run backwards. In 2003, a dd_rhelp script was written to automate the process of using dd_rescue, keeping track of what areas have been read on its own.

In 2004, GNU wrote a separate utility, unrelated to dd, called ddrescue. It has a more sophisticated dynamic block-size algorithm and keeps track of what has been read internally. The authors of both dd_rescue and dd_rhelp consider it superior to their implementation. To help distinguish the newer GNU program from the older script, alternate names are sometimes used for GNU's ddrescue, including addrescue (the name on freecode.com and freshmeat.net), gddrescue (Debian package name), and gnu_ddrescue (openSUSE package name).

Another open-source program called savehd7 uses a sophisticated algorithm, but it also requires the installation of its own programming-language interpreter.
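
The manual bookkeeping that plain dd requires can be illustrated with ordinary files. This is only a sketch of the idea, not how any of the rescue tools above work internally: source.img stands in for a failing drive, and its second block is simply assumed to be unreadable.

```shell
workdir=$(mktemp -d)
src="$workdir/source.img"           # stand-in for the failing drive
img="$workdir/rescued.img"

# Four 512-byte blocks filled with A, B, C and D.
for letter in A B C D; do
  dd if=/dev/zero bs=512 count=1 2>/dev/null | tr '\000' "$letter"
done > "$src"

# Pretend block 1 is unreadable: copy block 0, then blocks 2 and 3.
# skip= positions the input, seek= positions the output at the same offset.
dd if="$src" of="$img" bs=512 count=1 conv=notrunc 2>/dev/null
dd if="$src" of="$img" bs=512 skip=2 seek=2 count=2 conv=notrunc 2>/dev/null

# Block 1 of the image is left as a zero-filled gap.
hole_bytes=$(($(dd if="$img" bs=512 skip=1 count=1 2>/dev/null | tr -d '\000' | wc -c)))
img_size=$(($(wc -c < "$img")))
echo "$img_size $hole_bytes"        # 2048 0
```

Keeping skip and seek equal preserves the offsets of the recovered data, so the gap can be filled later if the bad region becomes readable.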


Benchmarking drive performance

To benchmark a drive and analyze the sequential (and usually single-threaded) system read and write performance for 1024-byte blocks:
* Write performance: dd if=/dev/zero bs=1024 count=1000000 of=1GB_file_to_write
* Read performance: dd if=1GB_file_to_read of=/dev/null bs=1024


Generating a file with random data

To make a file of 100 random bytes using the kernel random driver, read a single 100-byte block (bs=100 count=1) from /dev/urandom.
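
A minimal sketch, assuming a system that provides /dev/urandom; the output file name is a placeholder.

```shell
workdir=$(mktemp -d)

# One 100-byte block from the kernel random driver.
dd if=/dev/urandom of="$workdir/random.bin" bs=100 count=1 2>/dev/null

random_size=$(($(wc -c < "$workdir/random.bin")))
echo "$random_size"                 # 100
```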


Converting a file to upper case

To convert a file to uppercase, use the ucase conversion (conv=ucase).
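
A short sketch with placeholder file names, assuming a dd that implements the ucase conversion (some minimal implementations omit it):

```shell
workdir=$(mktemp -d)
printf 'hello world\n' > "$workdir/lower.txt"

# conv=ucase maps lowercase letters to uppercase while copying.
dd if="$workdir/lower.txt" of="$workdir/upper.txt" conv=ucase 2>/dev/null

cat "$workdir/upper.txt"            # HELLO WORLD
```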


Progress indicator

Being a program mainly designed as a filter, dd normally does not provide any progress indication. This can be overcome by sending a USR1 signal to the running GNU dd process (INFO on BSD systems), resulting in dd printing the current number of transferred blocks. The following one-liner results in continuous output of progress every 10 seconds until the transfer is finished, when dd-pid is replaced by the process-id of dd: while kill -USR1 dd-pid ; do sleep 10 ; done. Newer versions of GNU dd support the status=progress option, which enables periodic printing of transfer statistics to stderr.


Forks


dcfldd

dcfldd is a fork of GNU dd that is an enhanced version developed by Nick Harbour, who at the time was working for the United States' Department of Defense Computer Forensics Lab. Compared to dd, dcfldd allows more than one output file, supports simultaneous multiple checksum calculations, provides a verification mode for file matching, and can display the percentage progress of an operation. The last release was in 2021.


dc3dd

dc3dd is another enhanced GNU dd from the United States Department of Defense Cyber Crime Center (DC3). It can be seen as a continuation of dcfldd, with a stated aim of updating whenever the GNU upstream is updated. Its last release was in 2018.


See also

* Backup
* Disk cloning
* Disk Copy
* Disk image
* .img (filename extension)
* List of Unix commands
* ddrescue, a GNU version that copies data from corrupted files


External links

* dd manual page from the GNU Core Utilities
* dd for Windows
* Softpanorama dd page
* DD at Linux Questions Wiki
* Forensics (DD) Dcfldd