Almost-Balanced and Maximum Homopolymer-Run Restricted Codes for Data Storage in DNA

Students:
Victoria Goldin, Gili Doweck

Our goal is to describe an encoder-decoder for storing data in DNA where:

(1) The GC-AT content is at most 5% imbalanced.

(2) Homopolymer runs have length at most 3.

(3) The targets are encoded quaternary strands of length 100-500 nucleotides.

(4) The encoder and decoder run in at most linear time.

(5) The memory requirements are reasonable.
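
Constraints (1) to (3) can be checked on a candidate strand with a short validator. The following is a minimal sketch, not the project's encoder or decoder. It assumes that "at most 5% imbalanced" means the GC fraction lies within 0.05 of one half; the page does not define the measure precisely. The function names and parameters are hypothetical and chosen only for illustration.

```python
from itertools import groupby


def gc_fraction(strand: str) -> float:
    """Fraction of nucleotides in the strand that are G or C."""
    return sum(base in "GC" for base in strand) / len(strand)


def max_homopolymer_run(strand: str) -> int:
    """Length of the longest run of one repeated nucleotide."""
    return max(len(list(run)) for _, run in groupby(strand))


def satisfies_constraints(
    strand: str,
    max_imbalance: float = 0.05,  # assumed reading of "5% imbalanced"
    max_run: int = 3,
    min_len: int = 100,
    max_len: int = 500,
) -> bool:
    """Check the length, alphabet, GC balance, and run-length constraints."""
    n = len(strand)
    if not min_len <= n <= max_len:
        return False
    if set(strand) - set("ACGT"):
        return False
    gc_count = sum(base in "GC" for base in strand)
    # Small epsilon so that a strand exactly on the boundary is accepted
    # despite floating-point rounding.
    balanced = abs(gc_count - n / 2) <= max_imbalance * n + 1e-9
    return balanced and max_homopolymer_run(strand) <= max_run
```

Each check makes a single pass over the strand, so validation is linear in the strand length, in the same spirit as requirement (4).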

Additionally, we assume that every input length maps to a single output length. For algorithms whose output length varies, we use the worst-case output length.

Awards:

  • Best Project CS Faculty (2022)