DNA digital data storage refers to any process that stores digital data
in the base sequence of DNA, using commercially available
oligonucleotide synthesis machines for storage and DNA sequencing
machines for retrieval.

Among early examples of DNA data storage, in 2007 a device was created
at the University of Arizona that used addressing molecules to encode
mismatch sites within a DNA strand. These mismatches could then be read
out by performing a restriction digest, thereby recovering the data.

On August 16, 2012, the journal Science published research by George
Church and colleagues at Harvard University, in which DNA was encoded
with digital information that included an HTML draft of a 53,400-word
book written by the lead researcher, eleven JPG images and one
petabits can be stored in each cubic millimeter of DNA. The
researchers used a simple code in which bits were mapped one-to-one to
bases. Its shortcoming was that it produced long runs of the same
base, which are error-prone to sequence. The result showed that,
besides its other functions, DNA can also serve as a storage medium,
like hard drives and magnetic tapes.

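The one-to-one mapping and its weakness can be illustrated with a minimal Python sketch. The specific assignment of bits to bases (`0` to `A`, `1` to `G`) and all function names are assumptions chosen for illustration; they are not taken from the published scheme. The sketch shows how a run of identical bits turns directly into a run of identical bases.

```python
from itertools import groupby

# Hypothetical mapping for illustration only: one bit per base.
BIT_TO_BASE = {"0": "A", "1": "G"}
BASE_TO_BIT = {base: bit for bit, base in BIT_TO_BASE.items()}


def bytes_to_bits(data: bytes) -> str:
    """Render bytes as a string of '0' and '1' characters."""
    return "".join(f"{byte:08b}" for byte in data)


def encode_bits(bits: str) -> str:
    """Map each bit to exactly one base."""
    return "".join(BIT_TO_BASE[bit] for bit in bits)


def decode_bases(bases: str) -> str:
    """Invert encode_bits: map each base back to one bit."""
    return "".join(BASE_TO_BIT[base] for base in bases)


def longest_run(bases: str) -> int:
    """Length of the longest stretch of one repeated base."""
    return max((len(list(group)) for _, group in groupby(bases)), default=0)
```

For example, `encode_bits(bytes_to_bits(b"\x00\xff"))` yields eight `A` bases followed by eight `G` bases, so `longest_run` reports 8. Data with long stretches of zeros or ones therefore produces the long same-base runs that are hard to sequence.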
An improved system was reported in the journal Nature in January 2013,
in an article led by researchers from the European Bioinformatics
Institute (EBI) and submitted at around the same time as the paper of
Church and colleagues. Over five million bits of data, consisting of
text and audio files, were successfully stored in DNA that looked to
researchers like a speck of dust, and then perfectly retrieved and
reproduced. The encoded information consisted of all 154 of
Shakespeare's sonnets, a twenty-six-second audio clip of the "I Have a
Dream" speech by Martin Luther King, the well-known paper on the
structure of DNA by James Watson and Francis Crick, a photograph of
EBI headquarters in Hinxton, United Kingdom, and a file describing the
methods behind converting the data. All the DNA files reproduced the
information with between 99.99% and 100% accuracy. The main
innovations in this research were the use of an error-correcting
encoding scheme to ensure the extremely low data-loss rate, and the
idea of encoding the data in a series of overlapping short
oligonucleotides identifiable through a sequence-based indexing
scheme. The sequences of the individual strands of DNA overlapped in
such a way that each region of data was repeated four times to avoid
errors. Two of these four strands were constructed backwards, also
with the goal of eliminating errors. The costs per megabyte were
estimated at $12,400 to encode data and $220 for retrieval. However,
it was noted that the exponential decrease in DNA synthesis and
sequencing costs, if it continues into the future, should make the
technology cost-effective for long-term data storage within about ten
years.

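The overlapping layout can be sketched in Python as follows. This is a simplified illustration, not the published encoding: the fragment length of 100 bases and the step of 25 bases are assumed values chosen so that interior positions fall into four fragments, "backwards" is modelled as the reverse complement (also an assumption), and the function names are hypothetical. Indexing and error-correcting codes are left out.

```python
# Assumed, illustrative parameters (not taken from the text above).
FRAGMENT_LEN = 100  # bases per fragment
STEP = 25           # offset between consecutive fragments

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def make_fragments(seq: str, length: int = FRAGMENT_LEN, step: int = STEP):
    """Cut seq into overlapping fragments; every second one is flipped.

    Returns a list of (index, flipped, fragment) tuples.
    """
    fragments = []
    for index, start in enumerate(range(0, len(seq) - length + 1, step)):
        piece = seq[start:start + length]
        flipped = index % 2 == 1
        if flipped:
            piece = reverse_complement(piece)
        fragments.append((index, flipped, piece))
    return fragments


def restore(fragment) -> str:
    """Undo the flip so the fragment reads in the original orientation."""
    _, flipped, piece = fragment
    return reverse_complement(piece) if flipped else piece


def coverage(seq_len: int, length: int = FRAGMENT_LEN, step: int = STEP):
    """Count how many fragments cover each position of the sequence."""
    counts = [0] * seq_len
    for start in range(0, seq_len - length + 1, step):
        for pos in range(start, start + length):
            counts[pos] += 1
    return counts
```

With these parameters, any four consecutive fragments contain two flipped ones, and `coverage` reports 4 for interior positions. Positions near either end of the sequence are covered fewer times in this sketch, which a real scheme would have to compensate for.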
The long-term stability of data encoded in DNA was reported in
February 2015, in an article by researchers from ETH Zurich. By adding
Reed–Solomon error correction coding and by encapsulating the DNA
within silica glass spheres via sol-gel chemistry, the researchers
predicted error-free information recovery after up to 1 million years
at -18 °C and 2000 years if stored at 10 °C.

A group of researchers led by Boise State University is also working
toward a better way to store digital information using nucleic acid
memory (NAM). They note that the global flash memory market is
predicted to reach $30.2 billion this year, potentially growing to
$80.3 billion by 2025. They estimated that by 2040 the demand for
global memory will exceed the projected supply of silicon (the raw
material used to make flash memory), and that nucleic acid memory has
a retention time far exceeding that of electronic memory. They have
discussed the longevity of DNA materials through first-principles
theoretical calculations, published as a commentary article.

In March 2017, Dr. Yaniv Erlich and Dina Zielinski of Columbia
University and the New York Genome Center published a method known as
DNA Fountain, which allows perfect retrieval of information from a
density of 215 petabytes per gram of DNA. The technique approaches the
Shannon capacity of DNA storage, achieving 85% of the theoretical
limit. Using this method, they perfectly retrieved an operating system
called KolibriOS, the French movie Arrival of a Train at La Ciotat, a
$50 Amazon gift card, a computer virus, a Pioneer plaque and a study
by Claude Shannon, totalling 2.14 megabytes. A process which allows
2.18 × 10^15 retrievals using the DNA sample was also tested, and the
data could still be perfectly decoded. The method is, however, not
ready for large-scale use, as it costs $7000 to synthesize 2 megabytes
of data and another $2000 to read it.

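The cost and density figures quoted in this article can be put on a common footing with a little arithmetic. The following Python sketch only rearranges numbers stated above; the constant and function names are hypothetical, and the comparison is rough because the two studies used different methods and were priced at different times.

```python
# Figures quoted in the article (US dollars, megabytes, petabytes).
WRITE_COST_PER_MB_2013 = 12_400   # 2013 Nature study, encoding cost per MB
SYNTHESIS_COST_2017 = 7_000       # DNA Fountain, cost to synthesize...
SYNTHESIS_MEGABYTES_2017 = 2      # ...about this many megabytes
DENSITY_PB_PER_GRAM = 215         # DNA Fountain, petabytes per gram


def cost_per_megabyte(total_cost: float, megabytes: float) -> float:
    """Average cost of one megabyte, given a total cost and data size."""
    return total_cost / megabytes


def grams_needed(petabytes: float,
                 density: float = DENSITY_PB_PER_GRAM) -> float:
    """Mass of DNA needed for a data volume at the quoted density."""
    return petabytes / density


write_2017 = cost_per_megabyte(SYNTHESIS_COST_2017, SYNTHESIS_MEGABYTES_2017)
ratio = WRITE_COST_PER_MB_2013 / write_2017
```

Here `write_2017` comes to $3,500 per megabyte, so the quoted 2017 synthesis cost is roughly 3.5 times lower per megabyte than the quoted 2013 encoding cost. `grams_needed(215)` returns 1.0, restating the quoted density.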
Plant-based digital data storage
Skinner, Gary M.; Visscher, Koen; Mansuripur, Masud (2007-06-01).
"Biocompatible Writing of Data into DNA". Journal of Bionanoscience. 1
(1): 17–21. doi:10.1166/jbns.2007.005.
Church, G. M.; Gao, Y.; Kosuri, S. (2012). "Next-Generation Digital
Information Storage in DNA". Science. 337 (6102): 1628.
doi:10.1126/science.1226355. PMID 22903519.
Yong, E. (2013). "Synthetic double-helix faithfully stores
Shakespeare's sonnets". Nature. doi:10.1038/nature.2013.12279.
Goldman, N.; Bertone, P.; Chen, S.; Dessimoz, C.; Leproust, E.
M.; Sipos, B.; Birney, E. (2013). "Towards practical, high-capacity,
low-maintenance information storage in synthesized DNA". Nature. 494
(7435): 77–80. doi:10.1038/nature11875. PMC 3672958.
Grass, R. N.; Heckel, R.; Puddu, M.; Paunescu, D.; Stark, W. J.
(2015). "Robust Chemical Preservation of Digital Information on DNA
in Silica with Error-Correcting Codes". Angewandte Chemie
International Edition. 54 (8): 2552. doi:10.1002/anie.201411378.
Zhirnov, V.; Zadegan, R. M.; Sandhu, G. S.; Church, G. M.; Hughes,
W. L. (2016). "Nucleic acid memory". Nature Materials. 15 (4):
Yong, Ed. "This Speck of DNA Contains a Movie, a Computer Virus, and
an Amazon Gift Card". The Atlantic. Retrieved 3 March 2017.
"DNA could store all of the world's data in one room". Science
Magazine. 2 March 2017. Retrieved 3 March 2017.
Erlich, Yaniv; Zielinski, Dina (2 March 2017). "DNA Fountain enables
a robust and efficient storage architecture". Science. 355 (6328):
950–954. doi:10.1126/science.aaj2038. Retrieved 3 March 2017.
Mardis, E. R. (2008). "Next-Generation DNA Sequencing Methods".
Annual Review of Genomics and Human Genetics. 9: 387–402.
doi:10.1146/annurev.genom.9.081307.164359. PMID 18576944.
Cole, Adam (January 24, 2013). "Shall I Encode Thee In DNA? Sonnets
Stored On Double Helix?" (Download article and audio is available).
National Public Radio.
Naik, Gautam (January 24, 2013). "Storing Digital Data in DNA". The
Wall Street Journal. New York City: Dow Jones & Company. Retrieved
DNA Sequencing Caught in Deluge of Data. The New York Times
Aron, Jacob (February 15, 2015). "Glassed-in DNA makes the ultimate
time capsule". New Scientist. Retrieved Februa