1 Department of Computer Science, Science and Technology, Aarhus University
2 Department of Computer Science - Center for Massive Data Algorithmics, Department of Computer Science, Science and Technology, Aarhus University
3 Department of Computer Science, Science and Technology, Aarhus University
In many scientific applications it is required to reconstruct a raster dataset many times, each time using a different resolution. This leads to the following problem: let $G$ be a raster of $\sqrt{N} \times \sqrt{N}$ cells. We want to compute, for every integer $2 \le \mu \le \sqrt{N}$, a raster $G_\mu$ of $\lceil \sqrt{N}/\mu \rceil \times \lceil \sqrt{N}/\mu \rceil$ cells, where each cell of $G_\mu$ stores the average of the values of $\mu \times \mu$ cells of $G$. Here we consider the case where $G$ is so large that it does not fit in the main memory of the computer. We present a novel algorithm that solves this problem in $O(\mathrm{scan}(N))$ data block transfers from/to the external memory, and in $\Theta(N)$ CPU operations; here $\mathrm{scan}(N)$ is the number of block transfers needed to read the entire dataset from the external memory. Unlike previous results on this problem, our algorithm achieves this optimal performance without making any assumptions on the size of the main memory of the computer. Moreover, this algorithm is cache-oblivious; its performance does not depend on the data block size or the main memory size. We have implemented the new algorithm and we evaluate its performance on datasets of various sizes; we show that it clearly outperforms previous approaches to this problem. In this way, we provide solid evidence that non-trivial cache-oblivious algorithms can be implemented so that they perform efficiently in practice.
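To make the problem statement concrete, the following is a minimal in-memory sketch of what is being computed. It is not the cache-oblivious algorithm of the paper: it is a naive baseline that re-reads the whole raster for every resolution, so it performs on the order of N cell reads per value of μ rather than Θ(N) in total. The function names `coarsen` and `all_resolutions` are hypothetical, and the handling of border cells (averaging only the cells of G that actually exist when μ does not divide the side length) is an assumption, since the abstract does not spell it out.

```python
from math import ceil


def coarsen(grid, mu):
    """Return G_mu: each output cell is the average of a mu x mu block of grid.

    grid is a square list of lists. Border blocks are clipped to the raster,
    so they average fewer than mu * mu cells (an assumption, see lead-in).
    """
    side = len(grid)
    out_side = ceil(side / mu)
    out = []
    for bi in range(out_side):
        row = []
        for bj in range(out_side):
            # Collect the cells of G covered by output cell (bi, bj).
            cells = [
                grid[i][j]
                for i in range(bi * mu, min((bi + 1) * mu, side))
                for j in range(bj * mu, min((bj + 1) * mu, side))
            ]
            row.append(sum(cells) / len(cells))
        out.append(row)
    return out


def all_resolutions(grid):
    """Naively compute G_mu for every integer mu with 2 <= mu <= side length."""
    side = len(grid)
    return {mu: coarsen(grid, mu) for mu in range(2, side + 1)}
```

For a 4 x 4 raster, `all_resolutions` yields three rasters (μ = 2, 3, 4) with 2 x 2, 2 x 2 and 1 x 1 cells respectively.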
Lecture Notes in Computer Science: 21st Annual European Symposium, Sophia Antipolis, France, September 2-4, 2013. Proceedings, 2013, p. 61-72
Lecture Notes in Computer Science
European Symposium on Algorithms, 2013