## Improved Approximation Algorithms for Reconstructing the History of Tandem Repeats

In IEEE/ACM Transactions on Computational Bilogy and Bioinformatics, Vol. 6, No. 3, Suly-September 2009

Abstract:

**Some genetic diseases in human beings are dominated by short sequences repeated consec-
utively called tandem repeats**. Once a region containing tandem repeats is found, it is of great

interest to study the history of creating the repeats. The computational problem of reconstruct-

ing the duplication history of tandem repeats has been studied extensively in the literature.

Almost all previous studies focused on the simplest case where the size of each duplication block

is 1. Only recently we

**succeeded in giving the first polynomial-time approximation algorithm**

with a guaranteed ratio for a more general case where the size of each duplication block is at

most 2; the algorithm achieves a ratio of 6 and runs in O(n11) time. In this paper, we present

two new polynomial-time approximation algorithms for this more general case. One of them

achieves a ratio of 5 and runs in O(n9) time while the other achieves a ratio of 2:5 + epsilon for any

constant epsilon> 0 but runs slower.

Glossary:

*Tandem repeats*: Occur in DNA when a pattern of two or more nucleotides is repeated and the repetitions are directly adjacent to each other. An example would be: A-T-T-C-G-A-T-T-C-G-A-T-T-C-G in which the sequence A-T-T-C-G is repeated three times. Tandem repeat describes a pattern that helps determine an individual’s inherited traits. Tandem repeats can be very useful in determining parentage. Short tandem repeats are used for certain genealogical DNA tests. SOURCE: http://en.wikipedia.org/wiki/Tandem_repeat

leave a comment