1 Administration, Department of Chemistry, Faculty of Science, Københavns Universitet
2 Université de Lorraine
3 Administration, Department of Chemistry, Faculty of Science, Københavns Universitet
As a first step toward an efficient and accurate protocol for estimating amino acid pKa's in proteins, we show how to reproduce the pKa's of alcohol- and thiol-based residues (namely tyrosine, serine, and cysteine) in aqueous solution from the experimental pKa's of phenols, alcohols, and thiols. Our protocol is based on the linear relationship between the computed atomic charges of the anionic forms of the molecules (phenolates, alkoxides, or thiolates) and their respective experimental pKa values. It is tested with different environment approaches (gas phase or continuum solvent-based approaches), with five distinct atomic charge models (Mulliken, Löwdin, NPA, Merz-Kollman, and CHelpG), and with nine different DFT functionals combined with 16 different basis sets. The ability of semiempirical methods (AM1, RM1, PM3, and PM6) to predict the pKa's of thiols, phenols, and alcohols is also analyzed. In our benchmarks, the best combination for reproducing experimental pKa's is to compute NPA atomic charges using the CPCM model at the B3LYP/3-21G level for alcohols (R² = 0.995) and at the M06-2X/6-311G level for thiols (R² = 0.986). The applicability of the suggested protocol is tested on the amino acids tyrosine and cysteine, and precise pKa predictions are obtained. The stability of the amino acid pKa's with respect to geometrical changes is also tested by MM-MD and DFT-MD calculations. Given their accuracy and high computational efficiency, pKa predictions based on atomic charges are a promising method for predicting amino acid pKa's in a protein environment.
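The charge-to-pKa relationship described above can be illustrated with a minimal sketch of an ordinary least-squares fit. The numbers below are synthetic placeholders chosen to lie on a straight line; they are not values from this work, and the function names `fit_line` and `predict_pka` are hypothetical. In practice the charges would come from a quantum chemistry calculation on the anionic form (for example, NPA charges with a continuum solvent model) and the pKa's from experiment.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept; also returns R^2."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Coefficient of determination from residual and total sums of squares.
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    return slope, intercept, r2


def predict_pka(charge, slope, intercept):
    """Predict a pKa from an atomic charge using the fitted line."""
    return slope * charge + intercept


# SYNTHETIC placeholder data (assumption, not from the paper):
# atomic charge on the deprotonated O or S atom, and a reference pKa.
train_charges = [-0.80, -0.85, -0.90, -0.95]
train_pkas = [8.0, 10.0, 12.0, 14.0]

slope, intercept, r2 = fit_line(train_charges, train_pkas)

# Hypothetical charge for a new molecule, e.g. a residue side chain.
new_pka = predict_pka(-0.875, slope, intercept)
print(f"slope={slope:.3f} intercept={intercept:.3f} R2={r2:.4f} pKa={new_pka:.2f}")
```

A separate fit would be made for each chemical family (alcohols and phenols versus thiols), since the abstract reports a different best level of theory and a different R² for each.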
Journal of Chemical Information and Modeling, 2014, Vol 54, Issue 8, p. 2200-2213