Caro, Albert Pallejà1; García-Vallvé, Santiago4; Romeu, Antoni4
1 Department of Systems Biology, Technical University of Denmark2 Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark3 Universitat Rovira i Virgili4 Universitat Rovira i Virgili
ABSTRACT: BACKGROUND: In prokaryote genomes most of the co-directional genes are in close proximity. Even the coding sequence or the stop codon of a gene can overlap with the Shine-Dalgarno (SD) sequence of the downstream co-directional gene. In this paper we analyze how the presence of SD may influence the stop codon usage or the spacing lengths between co-directional genes. RESULTS: The SD sequences for 530 prokaryote genomes have been predicted using computer calculations of the base-pairing free energy between translation initiation regions and the 16S rRNA 3' tail. Genomes with a large number of genes with the SD sequence concentrate this regulatory motif from 4 to 11 bps before the start codon. However, not all genes seem to have the SD sequence. Genes separated from 1 to 4 bps from a co-directional upstream gene show a high SD presence, though this regulatory signal is located towards the 3' end of the coding sequence of the upstream gene. Genes separated from 9 to 15 bps show the highest SD presence as they accommodate the SD sequence within an intergenic region. However, genes separated from around 5 to 8 bps have a lower percentage of SD presence and when the SD is present, the stop codon usage of the upstream gene changes to accommodate the overlap between the SD sequence and the stop codon. CONCLUSION: The SD presence makes the intergenic lengths from 5 to 8 bps less frequent and causes an adaptation of the stop codon usage. Our results introduce new elements to the discussion of which factors affect the intergenic lengths, which cannot be totally explained by the pressure to compact the prokaryote genomes.