Simonsen, Martin (5); Sand, Andreas (6); Mailund, Thomas (5); Pedersen, Christian Nørgaard Storm (5)
1 Bioinformatics Research Centre (BiRC), Faculty of Science, Aarhus University
2 Department of Computer Science, Faculty of Science, Aarhus University
3 Department of Computer Science, Science and Technology, Aarhus University
4 Bioinformatics Research Centre (BiRC), Science and Technology, Aarhus University
5 Bioinformatics Research Centre (BiRC), Science and Technology, Aarhus University
6 Department of Computer Science, Science and Technology, Aarhus University
Bioinformatics focuses on developing computational methods for collecting, handling and analyzing biological data. Because the amount of data is often very large and the models used for analysis are complex, the demand for efficient methods that exploit modern computer hardware is increasing. Over the past ten years, processors have moved from single-core to multi-core architectures, and more recently Graphics Processing Units (GPUs) with hundreds of cores have become available for general-purpose computation. Making existing and new algorithms exploit these multi-core architectures is one of today's major challenges in fields such as bioinformatics. At the Bioinformatics Research Centre (BiRC), we are currently working on two projects that exploit modern multi-core architectures:

1. In molecular docking, we aim to identify small proteins called ligands that interact strongly with a target protein molecule. Such ligands potentially alter the function of the (disease-causing) target protein and can therefore be used in a modern drug discovery process. Evaluating a ligand against a target molecule requires a moderate amount of computation, but we often need to evaluate thousands of ligands against several target molecules, which requires the use of expensive computer clusters. By using GPUs to exploit the parallel nature of molecular docking, we significantly reduce the hardware requirements of large-scale molecular docking.

2. Hidden Markov Models (HMMs) are a statistical tool used in a wide range of applications within bioinformatics for modeling sequential data assumed to originate from a Markov process, e.g. gene annotation, alignment and inference of coalescence processes among species. Because of their computational efficiency, HMMs are one of the few methods used for genome-wide analysis, where sequences often consist of millions of characters. Nevertheless, analysis times are still often measured in days, weeks or months, and the models must therefore be kept relatively simple. By exploiting SSE instructions and the multi-core architecture of modern CPUs, we significantly decrease the running time of the analyses and thereby make analyses of more complex models feasible.
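
To give a sense of the computation behind HMM-based analyses, the following is a minimal sketch of the standard scaled forward algorithm, which computes the log-likelihood of an observed sequence. It is an illustration only and not the implementation used in the project; the function name `forward_log_likelihood` and the parameter names `init`, `trans` and `emit` are hypothetical. The inner sum over states is a matrix-vector product at each sequence position, which is the kind of regular, data-parallel work that SIMD instructions such as SSE and multiple cores can plausibly accelerate.

```python
import math


def forward_log_likelihood(init, trans, emit, observations):
    """Return log P(observations) for an HMM using the scaled forward algorithm.

    init[i]     : probability of starting in state i
    trans[i][j] : probability of moving from state i to state j
    emit[j][s]  : probability that state j emits symbol s
    observations: sequence of symbol indices
    (All names are illustrative assumptions, not taken from the project.)
    """
    if not observations:
        return 0.0  # the empty sequence has probability 1

    n_states = len(init)
    # Forward values for the first position.
    alpha = [init[j] * emit[j][observations[0]] for j in range(n_states)]
    log_likelihood = 0.0

    for t in range(len(observations)):
        if t > 0:
            symbol = observations[t]
            # One matrix-vector product per position: the data-parallel step.
            alpha = [
                emit[j][symbol]
                * sum(alpha[i] * trans[i][j] for i in range(n_states))
                for j in range(n_states)
            ]
        # Rescale to avoid numerical underflow on long sequences.
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the sequence is impossible under the model
        log_likelihood += math.log(scale)
        alpha = [a / scale for a in alpha]

    return log_likelihood


if __name__ == "__main__":
    # Toy two-state model over a two-symbol alphabet (made-up numbers).
    init = [0.5, 0.5]
    trans = [[0.9, 0.1], [0.2, 0.8]]
    emit = [[0.5, 0.5], [0.1, 0.9]]
    print(forward_log_likelihood(init, trans, emit, [0, 1, 1, 0]))
```

The running time of this sketch is proportional to the sequence length times the square of the number of states, which suggests why genome-scale sequences and richer models quickly become expensive.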