Sublinear alignment
Submitted by aml on Sun, 12/16/2007 - 09:24.
02/27/2004 - 16:30
02/27/2004 - 17:30
Etc/GMT
Classical algorithms for computing the similarity between two sequences use a dynamic programming matrix, and compares two strings of size n in O(n2) time. We address the challenge of computing the similarity of two strings in sub-quadratic time, for metrics which use a scoring matrix of unrestricted weights. Our algorithm applies to both local and global similarity computations. The speed-up is achieved by using the Lempel-Ziv parsing of both strings. It leads to an O(h n2 / logn) algorithm for an input on a constant-size alphabet, where h < 1 is the entropy of strings.
» Array Array