1 Efficient Computation, The Department2 Theoretical Computer Science, The Department3 Algorithms, Software & Systems, The Department
We consider compression of unordered sets of distinct elements. After a discus- sion of the general problem, we focus on compressing sets of fixed-length bitstrings in the presence of statistical information. We survey techniques from previous work, suggesting some adjustments, and propose a novel compression algorithm that allows transparent incorporation of various estimates for probability distribution. Our experimental results allow the conclusion that set compression can benefit from incorporat- ing statistics, using our method or variants of previously known techniques.