Metagenomic sequencing is a fundamental tool for identifying the functional potential of the prokaryotic species present in microbial communities, particularly for unculturable microbes. Recent advances in metagenomic assembly software now make it possible to generate collections of scaffolds comprising thousands of genome sequences, but binning these scaffolds into OTUs representative of microbial genomes is still challenging. In an attempt to obtain a deep characterization of the anaerobic digestion microbiome, different metagenomic binning approaches were integrated into a new tool. To facilitate the binning process, this tool combines two strategies: the taxonomic assignment of scaffolds and clustering based on coverage values. By applying this procedure, 373 high-quality genomes involved in the anaerobic digestion process were extracted and annotated using COG, KEGG, SEED and Pfam. These high-throughput approaches pose further basic challenges related to the computational effort needed for the taxonomic assignment of hundreds of new microbial genomes. It is also necessary to verify whether other DNA sequences deriving from the same species are already present in public databases. Metagenomics raises new fundamental questions about what a microbial species is and how it can be defined solely on the basis of its genome. To address these issues, we have developed a collection of scripts that check whether the same genome sequence is present not only in different assemblies but also in public databases and, finally, simplify its functional annotation.
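The combination of the two binning strategies described above can be illustrated with a minimal sketch. It is not the tool presented here: the function name `bin_scaffolds`, the `max_dist` threshold, the greedy centroid scheme and the example scaffolds are all assumptions made purely for illustration. The sketch places a scaffold in an existing bin only when it carries the same taxonomic label and its log-transformed coverage profile lies close to the bin centroid.

```python
from math import log10, sqrt


def bin_scaffolds(scaffolds, max_dist=0.2):
    """Greedy binning sketch (hypothetical, not the authors' tool).

    scaffolds: list of (name, taxon, coverages), where coverages holds one
    mean coverage value per sample. max_dist is an assumed threshold on the
    Euclidean distance between log10-transformed coverage profiles.
    """
    bins = []  # each bin: {"taxon": str, "centroid": [float], "members": [str]}
    for name, taxon, coverages in scaffolds:
        # Log-transform so that fold changes, not absolute depth, drive distance.
        profile = [log10(c + 1.0) for c in coverages]
        for b in bins:
            dist = sqrt(sum((p - q) ** 2 for p, q in zip(profile, b["centroid"])))
            if b["taxon"] == taxon and dist <= max_dist:
                # Update the running mean of the bin's coverage profile.
                n = len(b["members"])
                b["centroid"] = [
                    (q * n + p) / (n + 1) for p, q in zip(profile, b["centroid"])
                ]
                b["members"].append(name)
                break
        else:
            # No compatible bin found: start a new one with this scaffold.
            bins.append({"taxon": taxon, "centroid": profile, "members": [name]})
    return bins


# Illustrative input only: names, taxa and coverage values are made up.
example = [
    ("s1", "Methanosarcina", [50.0, 10.0]),
    ("s2", "Methanosarcina", [52.0, 11.0]),
    ("s3", "Clostridium", [51.0, 10.0]),
    ("s4", "Methanosarcina", [5.0, 80.0]),
]
example_bins = [b["members"] for b in bin_scaffolds(example)]
```

With this input, `s1` and `s2` share a bin, `s3` is kept apart by its taxonomic label, and `s4` is kept apart by its coverage profile. A real workflow would typically use a proper clustering method and handle scaffolds without a taxonomic assignment.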
16th International Symposium on Microbial Ecology (ISME16), 2016