Data storage: the DNA revolution
Two metal capsules, each containing 100 billion copies of the Declaration of the Rights of Man and the Citizen from 1789, and the Declaration of the Rights of Woman and the Female Citizen, drafted by Olympe de Gouges in 1791, have just joined the most precious documents in the Archives nationales. These archives, the very first to be preserved in the form of DNA, will enter the famous Armoire de Fer (iron chest) – the monumental safe built in 1790 – alongside all other French constitutions, Louis XVI’s diary, the standard metre bar and standard kilogram cylinder made of platinum, and Louis XIV’s will. Beyond the symbolism lies a possible technological revolution: after paper and silicon, will DNA be the next medium for information?
The limits of magneto-optical storage
In 2020, humanity produced 45 zettabytes¹ of digital data. This volume is expected to reach 175 ZB by 2025. Faced with this staggering increase, today’s storage media (optical discs, magnetic tape, and hard drives) appear to have reached their limits: they are fragile, with a life expectancy of 5-7 years; the energy-hungry data centres housing them now consume nearly 2% of global electricity production; and they take up space, with the ever-growing surface they cover now extending to 167 km² worldwide.
Yet with the rise of artificial intelligence and big data, the demand for bytes is not likely to decline. “In terms of storing the data we generate, we have been living on credit for a number of years. While today we can store 30% of it, this figure could plummet to 3% in the coming decades, failing a technological breakthrough,” warns Stéphane Lemaire, a researcher at the Laboratory of Computational and Quantitative Biology (LCQB).² Stored on DNA, all of the world’s data could fit in a shoebox. This technology thus represents a potential solution for what is known as cold data (approximately 70% of the material generated each year), which is rarely consulted but nevertheless invaluable, such as archives.
The biological path
The idea of using DNA as a medium for storing digital information is not new: it was suggested in 1959 by the American physicist Richard Feynman, who was awarded the Nobel Prize in Physics in 1965. Yet it was only in 2012 that it became a reality. “Today’s storage technologies are all based on chemical, physical, and mathematical methods. The biological path has not yet been explored,” notes Lemaire.
Over the last three years, the biologist has been working with Pierre Crozet, associate professor at Sorbonne Université (Paris), to develop a new system named DNA Drive. The idea is to use mechanisms borrowed from biology to easily produce and copy data on DNA fragments. Their project, called “The DNA Revolution”,³ took shape in 2018. It all began with an article on DNA storage published in Alma Mater, the inter-university student association journal, and now involves historians, philosophers, computer scientists, and archivists.
Challenged by students, Lemaire used his team’s skills in molecular biology to encode information on DNA – but not just any information. By way of proof of concept, they chose two highly symbolic and historical documents: the Declaration of the Rights of Man and the Citizen, and the Declaration of the Rights of Woman and the Female Citizen. Their stated ambition was to solve the question of information storage and longevity using a more environmentally-friendly, economical, and accessible process.
DNA Drive: a bio-inspired, bio-compatible, and bio-protected technology
The process is simple: binary digital data (0 or 1) is transformed into quaternary data (the four nucleotides of DNA: A, T, C, and G, in which A=C=0 and T=G=1, for a code with 1 bit/base). Data conversion is performed by an algorithm that generates DNA sequences in DNA Drive format. “The sequence is then stored – as it is in living beings – in long double helix DNA fragments known as plasmids or chromosomes.” The DNA molecules of the DNA Drive are designed to be manipulated by cells such as bacteria, which can copy or produce the information encoded in this manner.
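To make the 1 bit/base mapping concrete, here is a minimal Python sketch. Only the correspondence A=C=0 and T=G=1 comes from the description above. The rule used to pick between the two candidate bases (switching to the alternative whenever the preferred base would repeat the previous one) is an assumption made for illustration, and the function names `encode` and `decode` are hypothetical. It should not be read as the actual DNA Drive algorithm.

```python
# Illustrative sketch only. The bit-to-base mapping (A=C=0, T=G=1) is from
# the article; everything else here is an assumption, not the DNA Drive format.
BIT_TO_BASES = {"0": ("A", "C"), "1": ("T", "G")}
BASE_TO_BIT = {"A": "0", "C": "0", "T": "1", "G": "1"}


def encode(data: bytes) -> str:
    """Turn bytes into a nucleotide string at 1 bit per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    bases = []
    for bit in bits:
        preferred, alternative = BIT_TO_BASES[bit]
        # Hypothetical rule: never emit the same base twice in a row.
        if bases and bases[-1] == preferred:
            bases.append(alternative)
        else:
            bases.append(preferred)
    return "".join(bases)


def decode(sequence: str) -> bytes:
    """Recover the original bytes; A and C read as 0, T and G as 1."""
    bits = "".join(BASE_TO_BIT[base] for base in sequence)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    strand = encode(b"1789")
    print(strand)
    print(decode(strand))
```

Because two bases stand for each bit, decoding does not depend on which of the two the encoder picked. That redundancy is what leaves room for constraints on the sequence itself, at the cost of storing one bit per base rather than the two that four symbols could carry.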
The DNA Drive is also bio-protected: the DNA does not store any biologically significant genetic information. “Finally, the information can be accessed, as it is for oligonucleotides, using a portable DNA sequencer, which today is about the size of a USB key.”
Encoding takes a few days, and decoding a few hours. The DNA Drive aims to be an environmentally-friendly storage solution: durable, ecological, and ultra-compact, it can be conserved for thousands of years in metal capsules safe from water, air, and light, with no energy required.⁴
While they were at it, in 2021 Lemaire and Crozet created a start-up called Biomemory with the digital entrepreneur Erfane Arwani. “We still have plenty of challenges to meet,” points out Crozet. “We are now working to perfect our technology, using the improvements emerging in both DNA synthesis and sequencing to reduce costs. The goal for DNA Drive is to be viable and usable in data centres by 2030.”
- 1. 1 ZB (zettabyte): one sextillion (10^21) bytes of data.
- 2. CNRS / Sorbonne Université.
- 3. The project was also led in partnership with Twist Bioscience, an American company specialising in DNA synthesis, and Imagene, a French company specialising in the long-term conservation of DNA.
- 4. Each capsule can store a quantity of DNA equivalent to 5,000 TB of digital data.