INESC-ID   Instituto de Engenharia de Sistemas e Computadores Investigação e Desenvolvimento em Lisboa

Knowledge Discovery and Bioinformatics
Inesc-ID Lisboa

Dynamic Entropy-Compressed Sequences and Applications

10/09/2006, 16:00 - 17:00

Data structures are called succinct when they take little additional space (usually of lower order) compared to the data they give access to. A more ambitious challenge is that of compressed data structures, which aim to operate within space proportional to that of the compressed data they give access to. Designing compressed data structures goes beyond compression, in the sense that the data must be manageable in compressed form without first decompressing it. This trend has gained much attention in recent years. In this talk we will introduce a simple data structure for managing bit sequences, such that the space required is essentially that of the zero-order entropy of the sequence, and the operations of inserting/deleting bits, accessing a bit position, and computing rank/select over the sequence can all be done in logarithmic time. The rank operation gives the number of 1 (or 0) bits up to a given position, whereas select gives the position of the j-th 1 (or 0) bit in the sequence. This basic result has a surprising number of consequences. We show how it yields novel solutions to the dynamic partial sums with indels problem, dynamic wavelet trees, and dynamic compressed full-text indexes.
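
To make the operations named in the abstract concrete, the following is a minimal reference sketch of rank, select, and zero-order entropy on a plain Python list of bits. It is not the data structure presented in the talk: it uses uncompressed storage and linear-time scans, whereas the talk's structure achieves entropy-bounded space and logarithmic time. The function names, the 0-based positions, the inclusive reading of "up to a given position", and the 1-based j are all assumptions made for illustration.

```python
import math


def rank(bits, i, b=1):
    """Count the bits equal to b in bits[0..i], inclusive (0-based i; assumed convention)."""
    return sum(1 for x in bits[: i + 1] if x == b)


def select(bits, j, b=1):
    """Return the 0-based position of the j-th (1-based) bit equal to b, or -1 if there is none."""
    count = 0
    for pos, x in enumerate(bits):
        if x == b:
            count += 1
            if count == j:
                return pos
    return -1


def zero_order_entropy(bits):
    """Zero-order empirical entropy in bits per symbol: -sum(p * log2(p)) over the symbols 0 and 1."""
    n = len(bits)
    ones = sum(bits)
    h = 0.0
    for c in (ones, n - ones):
        if c:  # skip symbols that do not occur (and the empty sequence)
            p = c / n
            h -= p * math.log2(p)
    return h


bits = [1, 0, 1, 1, 0, 0, 1, 0]
print(rank(bits, 3))             # 1-bits among positions 0..3
print(select(bits, 3))           # position of the third 1-bit
print(zero_order_entropy(bits))  # entropy per bit of this sequence

# Dynamic updates are plain list operations here (linear time in this sketch).
bits.insert(2, 0)  # insert a 0-bit at position 2
del bits[0]        # delete the bit at position 0
```

In this naive version every query scans the sequence, so it serves only as a specification against which a compressed, logarithmic-time structure could be checked.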