Abstract
Accurate estimation of the pKas of cysteine residues in proteins could inform targeted approaches in hit discovery. The pKa of a targetable cysteine residue in a disease-related protein is an important physicochemical parameter in covalent drug discovery, as it influences the fraction of nucleophilic thiolate amenable to chemical protein modification. Traditional structure-based in silico tools predict cysteine pKas less accurately than the pKas of other titratable residues. Additionally, few comprehensive benchmark assessments of cysteine pKa prediction tools exist. This raises the need for extensive assessment and evaluation of methods for cysteine pKa prediction. Here, we report the performance of several computational pKa methods, including single-structure and ensemble-based approaches, on a diverse test set of experimental cysteine pKas retrieved from the PKAD database. The dataset consisted of 16 wildtype and 10 mutant proteins with experimentally measured cysteine pKa values. Our results show that these methods vary in their overall predictive accuracy. Among the test set of wildtype proteins evaluated, the best method yielded a mean absolute error of 2.3 pK units, highlighting the need to improve existing pKa methods for accurate cysteine pKa estimation. Given the limited accuracy of these methods, further development is needed before these approaches can be routinely employed to drive design decisions in early drug discovery efforts.
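To illustrate why a pKa error of this size matters, the following is a minimal sketch, not taken from the study itself. It assumes the standard Henderson-Hasselbalch relation for a single, non-interacting titratable site and a physiological pH of 7.4. The function names and the example pKa values are hypothetical and chosen only for illustration.

```python
def thiolate_fraction(pka: float, ph: float = 7.4) -> float:
    """Fraction of a cysteine side chain present as thiolate (S-).

    Assumes ideal Henderson-Hasselbalch behaviour for an isolated site:
    fraction = 1 / (1 + 10**(pKa - pH)). The default pH of 7.4 is an
    assumption, not a value taken from the abstract.
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


def mean_absolute_error(predicted, experimental):
    """Mean absolute error between predicted and experimental pKa values."""
    if len(predicted) != len(experimental) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - e) for p, e in zip(predicted, experimental))
    return total / len(predicted)


if __name__ == "__main__":
    # Hypothetical example: a cysteine with a "true" pKa of 8.5 and a
    # prediction that is 2.3 pK units too low.
    true_pka = 8.5
    predicted_pka = true_pka - 2.3
    print(f"thiolate fraction at true pKa:      {thiolate_fraction(true_pka):.3f}")
    print(f"thiolate fraction at predicted pKa: {thiolate_fraction(predicted_pka):.3f}")
    print(f"MAE for this single residue:        "
          f"{mean_absolute_error([predicted_pka], [true_pka]):.1f}")
```

Under these assumptions, an error of about 2 pK units is enough to move a residue from mostly protonated to mostly deprotonated at the chosen pH, which is why errors of this magnitude limit the use of predicted pKas for ranking cysteines by reactivity.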
Supplementary materials
Title
Supplementary Material
Description
Summary of the computed pKa values for Cys residues in the proteins studied and details of the pKa calculations performed, including descriptions of the constant-pH molecular dynamics simulations, the Amber-TI pKa calculations, and the input parameters.