1 Department of Physics, Technical University of Denmark
2 Theoretical Atomic-scale Physics, Department of Physics, Technical University of Denmark
The ongoing growth in computing power enables researchers to perform so many simulations that the results can no longer be analyzed with paper and pencil. Simple approaches to processing data, such as ordering the calculations in directories and using a script to create a spreadsheet or a small database, have to be redesigned for every new project. Sharing intermediate data with collaborators can be cumbersome, and publishing on the Internet requires specially tailored infrastructure to be set up. Due to the diverse and changing landscape of electronic structure codes and methods, there is no unique way of storing, collecting and presenting results. However, there are many partial solutions: VMDF (paper D), a tool to filter and analyze aggregated sets of electronic structure data, is a first step towards user-friendly analysis of data; the Inorganic Crystal Structure Database (ICSD) [1, 2] collects very specific data and makes it accessible through a web interface; and AflowLib (Ab-initio Electronic Structure Library) provides access to structure properties of many compounds on the Internet.
What is missing is a system that is Open Source Software, is generic enough to support different codes and different abstraction levels, enables users to analyze their own results, and allows them to share data with collaborators. The approach of the Computational Materials Repository (CMR) is to convert data to an internal format that maintains the original variable names without insisting on any semantics. Imported data can be implicitly grouped by user criteria and therefore maintain their natural connection in the database as well. Automatic data analysis is enabled through agents that analyze and group data based on predefined rules. Small projects can be handled without the need for database software, while bigger projects can use it to improve performance.
CMR enables one to create templates for the collection and analysis of data independently of the electronic structure code, simplifies screening studies involving many calculations, allows one to perform automatic analysis of data based on taxonomy, tags and keywords, provides the ability to share data with collaborators, and maintains the link from the derived data to the original data.
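The ideas summarized above (records that keep the original variable names, grouping by user criteria, rule-based agents, and a link back to the original data) can be illustrated with a minimal Python sketch. This is not CMR's actual interface: the names `Record`, `Group`, `import_record`, `group_by` and `run_agent`, as well as the sample keys, values and file paths, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Set


@dataclass
class Record:
    """One imported calculation (hypothetical structure, not CMR's format)."""
    data: Dict[str, Any]                          # keys keep the producing code's own names
    source: str = ""                              # link back to the original output
    tags: Set[str] = field(default_factory=set)   # filled in by agents


@dataclass
class Group:
    """A named collection of records that belong together."""
    name: str
    members: List[Record] = field(default_factory=list)


def import_record(raw: Dict[str, Any], source: str) -> Record:
    # Copy the raw key/value pairs unchanged: no renaming, no imposed semantics.
    return Record(data=dict(raw), source=source)


def group_by(records: List[Record], key_fn: Callable[[Record], str]) -> Dict[str, Group]:
    # Group records by a user-supplied criterion.
    groups: Dict[str, Group] = {}
    for rec in records:
        key = key_fn(rec)
        groups.setdefault(key, Group(name=key)).members.append(rec)
    return groups


def run_agent(records: List[Record], rule: Callable[[Record], bool], tag: str) -> int:
    # A very small "agent": tag every record that satisfies a predefined rule
    # and return how many records were tagged.
    count = 0
    for rec in records:
        if rule(rec):
            rec.tags.add(tag)
            count += 1
    return count


# Made-up sample data from two imaginary codes that use different variable names.
records = [
    import_record({"xc": "PBE", "energy": -3.72, "formula": "Cu"}, "cu/pbe.out"),
    import_record({"xc": "LDA", "energy": -4.10, "formula": "Cu"}, "cu/lda.out"),
    import_record({"functional": "PBE", "E_tot": -5.31, "formula": "Pt"}, "pt/run.log"),
]

groups = group_by(records, lambda r: r.data["formula"])
tagged = run_agent(
    records,
    lambda r: "PBE" in (r.data.get("xc"), r.data.get("functional")),
    "pbe",
)
```

In this sketch the rule, not the storage format, knows that `xc` and `functional` mean the same thing, which is one way to read "maintains the original variable names without insisting on any semantics". Everything lives in memory here; a larger project would presumably back the same records with a database.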
Technical University of Denmark, Center for Atomic-Scale Materials Physics, 2012