Big Data

I have music in my DNA

27 April 2018 | Written by Andrea Geremicca

Soon we could save music, photos, Word files, a copy of our passport and plenty of other data directly in DNA

In the last two years we have produced so much data that a big problem has arisen: where do we store it all? An IBM study offers a striking figure: 90% of the data produced (and stored) in the entire history of humanity was created in the last two years. We currently generate about 2.5 quintillion bytes of data (2.5 billion billion, or 2.5 × 10^18) every day, and according to IBM this is only the beginning: in the next few years growth will be even faster, thanks to the so-called IoT (Internet of Things).
Our hard drives, although increasingly capacious, will not have an easy life in the future. The speed at which we create data and the speed at which we improve the technologies that store it (hard disks) are now diverging.
It is therefore more than plausible that in the future we will no longer save our photos on an ordinary hard disk but will have to find an alternative archive. This archive may already exist: it is DNA. Yes, exactly that DNA.

As early as 2012, researchers at Harvard began studying DNA as a possible data storage medium, saving 52,000 books in a single piece of DNA. The method was conceptually simple: map the 0s and 1s of binary code, the language computers run on, onto the code of life (DNA is a sequence of four nitrogenous bases: adenine, guanine, thymine and cytosine). However, the experiment was only a partial success. The method revealed limits both in the storage capacity it could reach and in the completeness of the information recovered. But it was just the beginning.
This discovery prompted much reflection, especially of a medical nature. The researchers imagined that in the future this technology would let us observe the growth, development and death of a cell, as if it were equipped with a sort of aircraft “black box”. Other organizations, such as Microsoft, saw something beyond the medical purpose in DNA and began to evaluate the business opportunities, including that of the hard disk of the future.
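To make the idea of turning binary code into bases concrete, here is a minimal Python sketch. The two-bits-per-base table (A=00, C=01, G=10, T=11) and the function names `encode` and `decode` are assumptions chosen for illustration; this is not the scheme the Harvard researchers used, and real systems add error correction and avoid sequences that are hard to synthesize.

```python
# Illustrative mapping only: an assumption for this sketch,
# not the encoding used in the Harvard experiment.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a string of bases, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Turn a string of bases back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    strand = encode(b"Hi")
    print(strand)          # four bases per byte
    print(decode(strand))  # round trip back to the original bytes
```

With this table every byte becomes four bases, so a 15 MB file would need roughly 60 million bases before any redundancy is added.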

In 2016, Harvard researchers managed to save a GIF in the DNA of a living organism and subsequently transfer it to a bacterium. Using CRISPR-Cas9 technology, each pixel in the image was converted into nucleotides (the repeating units of DNA and RNA) and stored in DNA. From there the double-helix molecule was transferred to another bacterium and sequenced again: the two images were 90% identical. In practice, they had saved a GIF in a DNA folder, moved that folder somewhere else (into an E. coli bacterium) and reopened the same GIF from there, finding it 90% identical to the original.

A few days ago came news that makes it even clearer that this is not science fiction but something concrete: Massive Attack, a British group, has decided to mark the twentieth anniversary of its historic album Mezzanine by making it immortal, saving it in DNA. The whole album has been compressed with a special algorithm and so weighs only 15 MB: not much for an MP3 player, but a lot compared with what has been done so far with DNA. In fact, it would be the second-largest file ever saved in the double-helix molecule.

At this point, the question is: how far are we from using DNA like a USB stick? It’s hard to say exactly. It will depend on how much is invested in this direction and on the price of DNA sequencing, but at this rate of technological acceleration it could happen very soon.

Thanks to a new technique called DNA Fountain, we can now exploit almost all (about 85%) of the theoretical capacity of DNA storage and, above all, do so without errors. In fact, we can store about 215 million gigabytes in a single gram of DNA, with obvious advantages: small size and a lifespan for our data of hundreds of thousands of years.
At the moment, synthesizing 2 MB of data costs around 7,000 dollars, and at least another 2,000 are needed to read what we have saved. But if we look at how far the price of DNA sequencing has fallen, from $100 million in 2001 to about $700 today, it is not hard to imagine these prices dropping rapidly in the future. The code of my generation is made of 0s and 1s; that of my children could be made of nitrogenous bases and sugars.
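The figures quoted in this article can be combined in a quick back-of-the-envelope calculation. The Python sketch below is only an estimate under stated assumptions: gigabytes are taken as decimal (10^9 bytes), costs are assumed to scale linearly with file size, and the names `grams_for` and `round_trip_cost` are invented for the example.

```python
# Figures taken from the article.
BYTES_PER_DAY = 2.5e18       # data produced worldwide each day
GB_PER_GRAM = 215e6          # DNA Fountain density, gigabytes per gram
WRITE_COST_PER_2MB = 7000    # dollars to synthesize 2 MB
READ_COST_PER_2MB = 2000     # dollars to read 2 MB back
ALBUM_MB = 15                # size of the Mezzanine file

# Assumption: decimal gigabytes (10^9 bytes).
BYTES_PER_GB = 1e9


def grams_for(byte_count: float) -> float:
    """Grams of DNA needed to hold byte_count bytes at the quoted density."""
    return byte_count / (GB_PER_GRAM * BYTES_PER_GB)


def round_trip_cost(megabytes: float) -> float:
    """Dollars to write and then read a file, assuming linear scaling."""
    return megabytes / 2 * (WRITE_COST_PER_2MB + READ_COST_PER_2MB)


if __name__ == "__main__":
    print(f"DNA for one day of world data: {grams_for(BYTES_PER_DAY):.1f} g")
    print(f"Write + read a {ALBUM_MB} MB album: ${round_trip_cost(ALBUM_MB):,.0f}")
```

On these assumptions, a full day of the world's data would fit in under 12 grams of DNA, while writing and reading a single 15 MB album would cost about 67,500 dollars: the density is already there, the price is not.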

Andrea Geremicca


Since 2014 he has been part of the organizing team of TEDx Roma and a visiting professor and mentor at John Cabot University. In his articles, Andrea writes about the impact of exponential technologies on our society.
