Range reporting on categorical (or colored) data is a well-studied generalization of the classical range reporting problem in which each of the N input points has an associated color (category). A query then asks to report the set of colors of the points in a given rectangular query range, which may be far smaller than the set of all points in the query range. We study two-dimensional categorical range reporting in both the word-RAM and I/O-model. For the I/O-model, we present two alternative data structures for three-sided queries. The first answers queries in optimal O(lgB N + K/B) I/Os using O(N lg*N) space, where K is the number of distinct colors in the output, B is the disk block size, and lg*N is the iterated logarithm of N. Our second data structure uses linear space and answers queries in O(lg B N + lg(h) N + K/B) I/Os for any constant integer h ≥ 1. Here lg(1) N = lg N and lg(h) N = lg(lg(h-1) N) when h > 1. Both solutions use only comparisons on the coordinates. We also show that the lgB N terms in the query costs can be reduced to optimal lg lgB U when the input points lie on a U x U grid and we allow word-level manipulations of the coordinates. We further reduce the query time to just O(1) if the points are given on an N x N grid. Both solutions also lead to improved data structures for four-sided queries. For the word-RAM, we obtain optimal data structures for three-sided range reporting, as well as improved upper bounds for four-sided range reporting. Finally, we show a tight lower bound on one-dimensional categorical range counting using an elegant reduction from (standard) two-dimensional range counting.
Proceedings of the Annual Acm-siam Symposium on Discrete Algorithms, 2013, p. 265-276
Main Research Area:
Annual a C M - S I a M Symposium on Discrete Algorithms. Proceedings
ACM-SIAM symposium on Disrete AlgorithmsSymposium on Discrete Algorithms, 2013