CNRS News
Published on CNRS News (https://news.cnrs.fr)

Synthetic DNA holds great promise for data storage

10.21.2020, by Martin Koppe
Reading time: 5 minutes
Rost9/Stock.Adobe.com
The European project OligoArchive is working on a proof of concept for data storage on synthetic DNA. While this medium is in theory unrivalled in terms of information density and longevity, it still faces technical limits that need to be overcome.

Two septillion bytes by 2025: the advent of the Internet and of wireless networks has led to a massive accumulation of data. "If we were to store all of today's information on Blu-ray, we would need twenty-three piles of disks stretching to the moon," explains Marc Antonini, a research professor at the Computer Science, Signal Processing, and Systems laboratory (I3S) at Sophia Antipolis (southeastern France).¹ A crisis is unfolding, forcing Internet giants to build ever more data centres, which they site in cold regions because of the enormous cooling demands these facilities generate.

The world's data in a shoe box

The chemistry and molecules of living matter have drawn the interest of various researchers in the quest for better-adapted storage systems. Marc Antonini has focused on DNA, a single gramme of which can theoretically contain up to 455 exabytes of information, or 455 quintillion bytes. All of the world's data would thus fit in a shoe box.
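As a rough check on the shoe-box claim, the two figures quoted in the article (two septillion bytes of data, and a theoretical 455 exabytes per gramme) can be divided against each other. The sketch below is only a back-of-the-envelope calculation: the constant and function names are illustrative, and it assumes the theoretical density could actually be reached, which no practical system currently does.

```python
# Figures quoted in the article; names are illustrative only.
WORLD_DATA_BYTES = 2e24   # "two septillion bytes by 2025"
BYTES_PER_GRAM = 455e18   # "455 exabytes" per gramme, a theoretical upper bound


def dna_mass_grams(total_bytes: float, bytes_per_gram: float = BYTES_PER_GRAM) -> float:
    """Mass of DNA needed to hold total_bytes at the given density."""
    return total_bytes / bytes_per_gram


mass_g = dna_mass_grams(WORLD_DATA_BYTES)
print(f"{mass_g / 1000:.1f} kg of DNA")  # prints "4.4 kg of DNA"
```

At the theoretical density, the total comes to a few kilogrammes of DNA, which is consistent with the order of magnitude the article describes.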

Given the pressing need and the improvement of sequencing techniques, the idea is increasingly appealing. "DNA has the advantage of being extremely compact and resistant to the passage of time," Antonini points out. "We can sequence that of mammoths, which is tens of thousands of years old, whereas data on hard drives has to be duplicated every five years as a precaution, and data on magnetic tapes every twenty years." DNA could replace these tedious and energy-consuming processes.

The scientist and his team are working on OligoArchive, a three-year project funded with €3 million from the European Commission. It brings together the Institute of Molecular and Cellular Pharmacology (IPMC),² I3S, the Eurecom Graduate School and Research Centre in Digital Sciences, Imperial College London (UK), and the Irish start-up HelixWorks Technologies Limited. Together they are seeking to develop a proof of concept for each stage of DNA storage: synthesising and storing data, and retrieving it as efficiently as possible. The project's goal is to build a DNA disk: a fully functional end-to-end prototype demonstrating that DNA could one day replace current archival storage technologies on magnetic tape.

Hermetic capsules containing synthetic DNA. These capsules can be conserved at room temperature for decades or even longer.
Imagene SA, Evry

One of the main stumbling blocks, however, is price. Whether natural or synthetic, DNA consists of sequences of four nucleotides, also known as bases. Storage systems use these as the symbols of a quaternary system, as opposed to the binary system of computers. Yet it currently costs one dollar to synthesise two hundred nucleotides, and encoding a single image requires a few thousand of them, which makes it prohibitively expensive to convert the gigantic mass of data that needs to be dealt with.
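To give a sense of scale, the quoted price of one dollar per two hundred nucleotides can be turned into a cost per byte. The sketch below assumes an ideal quaternary code carrying two bits per nucleotide with no overhead for error correction or addressing. That figure is an assumption for illustration, not something the article states, and real schemes would need more nucleotides. The function and constant names are hypothetical.

```python
NUCLEOTIDES_PER_DOLLAR = 200  # price quoted in the article
BITS_PER_NUCLEOTIDE = 2       # assumption: ideal quaternary coding, no overhead


def synthesis_cost_usd(num_bytes: int) -> float:
    """Estimated synthesis cost in dollars for num_bytes of data."""
    nucleotides = num_bytes * 8 / BITS_PER_NUCLEOTIDE
    return nucleotides / NUCLEOTIDES_PER_DOLLAR


for label, size in [("1 kB", 1_000), ("1 MB", 1_000_000), ("1 GB", 1_000_000_000)]:
    print(f"{label}: ${synthesis_cost_usd(size):,.0f}")
# 1 kB: $20
# 1 MB: $20,000
# 1 GB: $20,000,000
```

Even under this optimistic assumption, a single gigabyte would cost on the order of tens of millions of dollars to synthesise at the quoted price.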

Hot and cold data

Solutions exist to overcome this problem, such as not conserving everything on DNA, and making a distinction between cold and hot data. "Cold data is that which is accessed only rarely, not to say never, such as old digitised photos that have accumulated on the cloud, or administrative archives. This stock grows by 60% each year, while the storage capacity of current systems only improves by 20%, which leads to the construction of even more data centres."

This cold data does not have to be accessible with the same immediacy as items used every day, and is therefore an excellent candidate for alternative forms of storage such as synthetic DNA, because it requires fewer successive encoding and decoding cycles. "It would be invaluable for the cultural heritage sector, which could easily keep multiple copies of film or museum archives. As the fire at Universal Studios in Hollywood in 2008 unfortunately showed, a number of master tapes were lost because they had not been duplicated."

The OligoArchive team is looking at solutions to reduce costs, such as limiting the number of nucleotides needed to store the same amount of information. As previously mentioned, DNA consists of four different nucleotides called A, C, G, and T. A first, simple DNA encoding technique involves assigning a pair of binary digits to each: A for 00, C for 01, G for 10 and T for 11. This is referred to as transcoding.
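The two-bit mapping described above is simple enough to sketch in a few lines. The example below is a minimal illustration of that mapping only, not the OligoArchive encoder. The function names `encode` and `decode` are hypothetical, and the sketch ignores everything a real system would add (addressing, error correction, strand length limits).

```python
# Mapping given in the article: A = 00, C = 01, G = 10, T = 11.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Transcode bytes into a nucleotide string, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Recover the original bytes from a nucleotide string."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"DNA")
print(strand)  # prints "CACACATGCAAC"
assert decode(strand) == b"DNA"
```

Each byte becomes exactly four nucleotides, so the length of the strand is fixed by the size of the input.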

Circumventing the rules of living matter

However, while the synthetic DNA code generated to represent a piece of digital data contains no genetic information that can be understood by the world of living matter, it remains subject to some of its rules. For example, if a nucleotide is repeated too many times without interruption, sequencing it becomes error-prone. Transcoding cannot easily prevent such repeats, nor can it control the length, and hence the cost, of the DNA sequences generated. To mitigate these problems, researchers propose integrating an encoding system directly at the level of digital data compression. The challenge is to create sequences of DNA code that can hold, on average, even more digital data in the same number of nucleotides, thereby reducing the cost of synthesis. The team is also developing algorithms that automatically correct, during decoding, the errors introduced by DNA sequencing.
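The repetition constraint can be illustrated with a small check on the longest run of a single nucleotide in a strand. This is a sketch for illustration, not the project's method. The function names are hypothetical, and the threshold of three is an arbitrary example value, since the article does not say how many repeats are tolerable.

```python
from itertools import groupby


def longest_run(strand: str) -> int:
    """Length of the longest stretch of one repeated nucleotide."""
    return max((len(list(group)) for _, group in groupby(strand)), default=0)


def violates_run_limit(strand: str, max_run: int = 3) -> bool:
    """True if any nucleotide repeats more than max_run times in a row.

    max_run=3 is an illustrative threshold, not a figure from the article.
    """
    return longest_run(strand) > max_run


print(longest_run("CACACATGCAAC"))     # prints 2
print(violates_run_limit("AAAAAAAA"))  # prints True
```

Under the two-bit mapping, a zero byte becomes AAAA, so a file containing stretches of zero bytes would produce exactly the kind of long repeats that a check like this flags. This is one reason a plain mapping is hard to use on its own.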

Digital images after DNA encoding and synthesis. On the left, sequencing and decoding using a poorly adapted compression solution; on the right, the same operations using the compression solution developed by the OligoArchive project.
Laboratoire I3S

"When we speak on the telephone, problems can occur with the encoding channels and the sound becomes choppy, or communication is even cut. The noise introduced by DNA sequencing produces a similar phenomenon. We are therefore striving to make encoding more robust. We would also like to standardise compression systems beyond our study group, and are contributing to the JPEG International Standardization Committee with this goal in mind." The team has given itself three years to provide its first proofs of concept, and to pave the way for the practical use of artificial DNA storage.

Footnotes
  • 1. CNRS / Université Côte d’Azur.
  • 2. CNRS / Université Côte d’Azur.

Source URL:https://news.cnrs.fr/articles/synthetic-dna-holds-great-promise-for-data-storage
