Ribosomal gene sequences of Bacteria and Archaea are very similar among species of the same genus; in some cases, similarity is 99% or higher across the 1500 bp that make up the 16S rDNA. Species-level characterisation from 16S hypervariable regions therefore remains a challenge, especially with the very short, low-quality reads typical of next-generation sequencers (Mande S.S. et al., 2012). Previous works analysed the microbial community composition in biogas reactors via 16S rDNA sequencing (Luo, G. et al., 2013; Werner, J.J. et al., 2011). To address this, we developed a bioinformatics strategy, implemented as a tool, to review the generated dataset and to obtain stricter control over the bacterial composition at the species level, together with an estimate of its reliability. The program performs a local similarity search, evaluates the results at high stringency (95 to 100%), and returns all candidate species with unique or multiple matches for each genus. Species identification yields different categories of reliability: some lead to unambiguous species identification even within the same genus, while others give multiple matches with the same probability. The software was used to analyse samples taken during the digestion process in three independent biogas reactors continuously fed with raw cattle manure. Among the most represented genera (>1% relative abundance in the community), Clostridium proved the most complex to resolve. Some species in this genus, Clostridium ultunense and Clostridium irregulare, were assigned with high probability (100% and 99.7% of unique matches), while 11 others had only a few unique matches (0.1 to 10%). The genera Bacteroides, Acetobacterium and Pseudomonas were also difficult to assign, with medium-low probability (1 to 50%).
In contrast, several other genera were assigned with high probability and no multiple matches. Some of them included a single uniquely identified species, others more than one (e.g. Dialister succinatiphilus with 37% and Dialister propionicifaciens with 63%, Tissierella praeacuta with 26% and Tissierella creatinophila with 74%, Proteiniphilum acetatigenes with 100%, Halothermothrix orenii with 100%, Thermoflavimicrobium dichotomicum with 100%). Furthermore, the results were compared with those from MG-RAST (Meyer F. et al., 2008) to test our strategy. We also found that the method can be used to determine which hypervariable region of the 16S rDNA is most efficient for species-level identification in different genera. We conclude that identification at the species level remains a challenge of major interest, but it can be done reliably for specific genera: we uniquely identified the species in up to 67% of the most abundant genera and obtained a less reliable identification for the remaining 33%.
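The distinction between unique and multiple matches can be illustrated with a minimal sketch. It is not the tool described here: the function names, the (read, species, identity) tuple layout, the tie rule (species sharing the best identity count as a multiple match) and the definition of the unique-match fraction are all assumptions made for illustration, and the input is toy data with placeholder species names.

```python
from collections import defaultdict


def classify_reads(hits, min_identity=95.0):
    """Return {read_id: set of best-scoring species} for reads above the threshold.

    hits: iterable of (read_id, species, percent_identity) tuples (assumed layout).
    A read whose set holds one species is a unique match; more than one
    species at the same best identity is a multiple match.
    """
    best = {}  # read_id -> (best identity seen, species at that identity)
    for read_id, species, identity in hits:
        if identity < min_identity:
            continue  # below the stringency threshold, ignore the hit
        top, names = best.get(read_id, (-1.0, set()))
        if identity > top:
            best[read_id] = (identity, {species})
        elif identity == top:
            names.add(species)  # tie: the read stays ambiguous
    return {read_id: names for read_id, (_, names) in best.items()}


def unique_match_fraction(assignments):
    """Per species: reads matched uniquely / all reads matched (assumed metric)."""
    unique = defaultdict(int)
    total = defaultdict(int)
    for names in assignments.values():
        for species in names:
            total[species] += 1
            if len(names) == 1:
                unique[species] += 1
    return {species: unique[species] / total[species] for species in total}


# Toy input, hypothetical species names.
toy_hits = [
    ("read1", "Genus alpha", 100.0),
    ("read2", "Genus alpha", 99.0),
    ("read2", "Genus beta", 99.0),   # tie with alpha: multiple match
    ("read3", "Genus beta", 97.5),
    ("read3", "Genus alpha", 96.0),  # lower identity: beta wins
    ("read4", "Genus alpha", 90.0),  # below threshold: dropped
]
assignments = classify_reads(toy_hits)
fractions = unique_match_fraction(assignments)
```

In this toy run, read2 is the only multiple match, so each placeholder species has one unique match out of two matching reads. Raising `min_identity` towards 100 makes the filter stricter, which is the stringency range (95 to 100%) mentioned in the abstract.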
Proceedings of the 2nd International Conference on Biogas Microbiology, 2014