The amount of data generated in applications such as astronomy and genomics has highlighted the growing need for compression schemes that allow us to interact with and manipulate data directly in the compressed domain.
Classical compression schemes such as Lempel-Ziv are suboptimal in this regard, since recovering even a single message symbol requires decompressing the entire source. Is it possible to design compression schemes that are space-optimal (achieving compressed lengths close to the entropy) while still allowing queries on the original text to be answered efficiently by probing only a small number of compressed bits?
I will briefly introduce variations of this problem; much of the talk will focus on providing random access (also called local decodability, or locality) in the compressed domain.
Rather surprisingly, lossless compression of a random source can be performed with a notion of strong locality at rates close to entropy: any individual source symbol can be decoded by probing a constant number of compressed bits (independent of the source length n), with a probability of error that vanishes in n. I will briefly discuss some known results and our recent work in this area. I will then turn to the distributed compression setting, and show that for two separately encoded correlated sources (the Slepian-Wolf setup), lossless compression with strong locality is in general not achievable.
This is based on joint work with Aslan Tchamkerten and Venkat Chandar.