DNA Data Storage: Myths and Realities

Impressive research results on DNA data storage were recently published in Nature. DNA occurs in nature, but it can also be synthesized in vitro as oligonucleotides, so there have already been several studies on storing information in DNA. What sets this study apart from earlier ones is that it succeeded in storing and retrieving a very short video in live bacteria using the CRISPR-Cas system, the latest gene editing technology. The lower left shows the original image and the right shows the image recovered from the E. coli DNA, indicating that the data is almost completely intact. The data recovery rate is about 90%.

With this remarkable result released, interest in DNA computing and data storage is growing. Some say that DNA computing is the future and that commercialization is on the horizon. However, many obstacles must be overcome before this technology can actually be commercialized. I personally think DNA computing and data storage have great advantages, but excessive expectations always bring serious side effects. So in this post I want to cover the history, advantages, disadvantages, and feasibility of DNA computing and data storage.

DNA as Natural Information Processing Machines

The story of RNA and DNA, the key substances that retain life's information and evolve through mutation and natural selection, has been discussed in several posts on this blog. Billions of years of evolution have demonstrated that DNA works as an information storage medium, and it is also excellent in terms of storage capacity and longevity.

Computer scientists developed software algorithms inspired by these characteristics of DNA relatively early, but attempts to use DNA itself as a computing material are said to have begun in 1994 with Leonard Adleman of the University of Southern California. Adleman succeeded in solving a seven-node Hamiltonian path problem using DNA, and laid the foundations for research into building various types of Turing machines based on DNA.

Research on using DNA for data storage arguably first produced meaningful results with a device created at the University of Arizona in 2007. The most important progress, however, was made by George Church's research team at Harvard University. In 2012 they published a very important paper, "Next-Generation Digital Information Storage in DNA," in Science (yes, the same team as the research above). They demonstrated the practical possibility of DNA data storage by saving an HTML draft of a book with 53,400 words, 11 JPG images, and a JavaScript program in DNA, copying it, and then reading it back with very few errors.
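As a rough illustration of what "saving data in DNA" means, here is a minimal Python sketch that maps every two bits of a file to one nucleotide. The mapping table and the function names encode_to_dna and decode_from_dna are hypothetical, chosen only for illustration. This is not the encoding used in the Church team's paper, and real schemes add addressing, redundancy, and constraints on which sequences are allowed.

```python
# Hypothetical 2-bits-per-base mapping, for illustration only:
# 00 -> A, 01 -> C, 10 -> G, 11 -> T
BASES = "ACGT"


def encode_to_dna(data: bytes) -> str:
    """Turn each byte into four nucleotides, most significant bits first."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def decode_from_dna(strand: str) -> bytes:
    """Reverse encode_to_dna; raises ValueError on malformed input."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            # str.index raises ValueError for anything other than A, C, G, T
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    strand = encode_to_dna(b"Hi")
    print(strand)                   # CAGACGGC
    print(decode_from_dna(strand))  # b'Hi'
```

Under this toy mapping one byte always becomes four bases, so even a small file turns into a very long sequence that has to be split across many short synthesized strands.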

Stability, Capacity and Economics

However, doubts about the high cost and about the stability of data stored in DNA have led many to believe that such research will be difficult to commercialize.

Research on the long-term stability of data encoded in DNA made great strides at ETH Zurich in 2015. The researchers encapsulated the DNA in silica glass spheres and reported that information could be recovered error-free for up to 1 million years at -18 °C, and for 2,000 years if stored at 10 °C. Boise State University researchers have also argued for the feasibility of nucleic acid memory (NAM), based on theoretical calculations of the stability of DNA as a material:

"With information retention times that range from thousands to millions of years, volumetric density 10^3 times greater than flash memory and energy of operation 10^8 times less, we believe that DNA used as a memory-storage material in NAM products promises a viable and compelling alternative to electronic memory."

In fact, these findings are to be expected, because DNA can provide longer-term stability than today's mainstream silicon- and magnetic-based memories. Then what about capacity? Believe it or not, according to New Scientist, just 1 gram of DNA is theoretically capable of holding 455 exabytes, enough for all the data held by Google and Facebook with room to spare.

But it is still very expensive. Although the price of DNA synthesis is expected to fall very rapidly, the cost of encoding 1 megabyte of data into DNA is still generally estimated at more than $10,000. Therefore, without a breakthrough in DNA synthesis technology, commercialization of DNA data storage in its current form still has a long way to go.
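To get a feel for that figure, the sketch below scales the $10,000-per-megabyte estimate to larger data sizes. It assumes the cost grows linearly with size and that 1 megabyte means 10^6 bytes. Both are my simplifying assumptions, and the function name synthesis_cost_usd is hypothetical.

```python
COST_PER_MB_USD = 10_000   # lower-bound estimate quoted in the text
BYTES_PER_MB = 10**6       # assumption: decimal megabytes


def synthesis_cost_usd(size_bytes: int) -> float:
    """Lower-bound cost of encoding size_bytes into DNA, assuming linear scaling."""
    return size_bytes / BYTES_PER_MB * COST_PER_MB_USD


if __name__ == "__main__":
    examples = [("1 MB", 10**6), ("1 GB", 10**9), ("1 TB", 10**12)]
    for label, size in examples:
        print(f"{label}: at least ${synthesis_cost_usd(size):,.0f}")
```

Under these assumptions a single gigabyte would cost at least 10 million dollars to write, which is why the price of synthesis dominates the discussion.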

Other Problems

Beyond economics, there are more hurdles to overcome before this technology can be commercialized. One issue is that data processing is very slow. Experiments conducted by Microsoft show that data can be converted to DNA at a rate of about 400 bytes per second. This is very slow compared with current computer storage technology, which can process at least hundreds of megabytes per second. This is a fairly fundamental problem, and the likely solution is to speed things up through massive parallelism. That in turn requires a great deal of parallel, redundant hardware, which makes the economics problem even worse.
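The gap is easier to appreciate with some arithmetic. The sketch below takes the 400 bytes per second figure from above and, as an assumption, 200 megabytes per second for conventional storage (standing in for "hundreds of megabytes per second"). It then estimates how long writing 1 gigabyte would take and how many parallel write channels would be needed to close the gap. It assumes perfect linear scaling, which a real system would not achieve, and the function names are hypothetical.

```python
import math

DNA_WRITE_BPS = 400               # bytes per second, figure quoted in the text
CONVENTIONAL_BPS = 200 * 10**6    # assumption: 200 MB/s for a conventional drive
SECONDS_PER_DAY = 86_400


def write_time_days(size_bytes: int, rate_bps: float) -> float:
    """Days needed to write size_bytes at a constant rate."""
    return size_bytes / rate_bps / SECONDS_PER_DAY


def channels_needed(target_bps: float, per_channel_bps: float = DNA_WRITE_BPS) -> int:
    """Parallel channels needed to reach target_bps, assuming linear scaling."""
    return math.ceil(target_bps / per_channel_bps)


if __name__ == "__main__":
    one_gb = 10**9
    print(f"1 GB at 400 B/s: {write_time_days(one_gb, DNA_WRITE_BPS):.1f} days")
    print(f"Channels to match 200 MB/s: {channels_needed(CONVENTIONAL_BPS):,}")
```

With these numbers, one gigabyte takes about 29 days on a single channel, and matching the assumed conventional drive would take 500,000 channels running in parallel.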

Another problem is the controller for DNA data storage. To manipulate DNA, organic editing techniques such as the CRISPR-Cas system and organic containers such as bacteria are required. Managing these controllers and containers is very difficult. DNA is dense and durable, but they are not. Although evolution has made DNA a powerful information storage medium, in living things it is designed to be transcribed into mRNA and translated into many proteins before it has any substantial effect on the organism's life cycle. There is a reason for this.

Time Capsule of Human Civilization

DNA data storage and DNA computing hold many possibilities. However, they are not a replacement for existing computers. DNA data storage is slow and difficult to manipulate, but once data is recorded, it can hold huge amounts of it, keep it easily for thousands of years, and in some cases preserve it for millions of years. Reading the data is also relatively easy and fast.

Does anything come to mind? I think of the ancient Library of Alexandria, said to have been a huge collection of human civilization's knowledge that disappeared long ago. DNA data storage is probably the most suitable technology for a huge archive or time capsule of human civilization. However, many obstacles still have to be overcome before we reach the heyday of so-called DNA computing, in which the computing environment changes drastically on the basis of biological materials.


CRISPR–Cas encoding of a digital movie into the genomes of a population of living bacteria
Next-Generation Digital Information Storage in DNA
Data-Storage for Eternity
Nucleic Acid Memory
Microsoft Reports a Big Leap Forward for DNA Data Storage

