TY - JOUR
T1 - High capacity DNA data storage with variable-length Oligonucleotides using repeat accumulate code and hybrid mapping
AU - Wang, Yixin
AU - Noor-A-Rahim, Md
AU - Zhang, Jingyun
AU - Gunawan, Erry
AU - Guan, Yong Liang
AU - Poh, Chueh Loo
N1 - Publisher Copyright: © 2019 The Author(s).
PY - 2019/11/21
Y1 - 2019/11/21
AB - Background: With its inherent high density and durability, DNA has recently been recognized as a promising medium for storing enormous amounts of data over millennia. To overcome the limitations of a recently reported high-capacity DNA data storage system while achieving a competitive information capacity, we explore a new coding system that facilitates the practical implementation of high-capacity DNA data storage. Result: In this work, we devised and implemented a DNA data storage scheme with variable-length oligonucleotides (oligos), introducing a hybrid DNA mapping scheme that converts digital data to DNA records. The encoded DNA oligos store 1.98 bits per nucleotide (bits/nt) on average (approaching the upper bound of 2 bits/nt) while conforming to the biochemical constraints. In addition, an oligo-level repeat-accumulate coding scheme is employed to address data loss and corruption in the biochemical processes. In a wet-lab experiment, error-free retrieval of 379.1 KB of data with a minimum coverage of 10x is achieved, validating the error resilience of the proposed coding scheme. Furthermore, the theoretical analysis shows that the proposed scheme exhibits a net information density (user bits per nucleotide) of 1.67 bits/nt while achieving 91% of the information capacity. Conclusion: To advance towards practical implementations of DNA storage, we proposed and tested a DNA data storage system featuring a high-potential mapping (bit-to-nucleotide conversion) scheme and a low-redundancy yet highly efficient error correction code design. The advancement reported here moves us closer to achieving a practical high-capacity DNA data storage system.
KW - DNA data storage
KW - Long term data storage
KW - Next-generation information storage
UR - https://www.scopus.com/pages/publications/85075609004
U2 - 10.1186/s13036-019-0211-2
DO - 10.1186/s13036-019-0211-2
M3 - Article
AN - SCOPUS:85075609004
SN - 1754-1611
VL - 13
JO - Journal of Biological Engineering
JF - Journal of Biological Engineering
IS - 1
M1 - 89
ER -