Physics-Informed Machine Learning Enables Rapid Macroscopic pKa Prediction

25 April 2025, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Accurate prediction of macroscopic pKa values remains a central challenge in computational chemistry and is critical for modeling pH-dependent properties such as solubility, membrane permeability, and charge state. Here we introduce Starling, a physics-informed neural network based on the Uni-pKa architecture and trained to predict per-microstate free energies, from which macroscopic pKa values are computed via thermodynamic ensemble modeling. Unlike approaches that treat protonation events in isolation, Starling explicitly resolves protonation and tautomeric microstates, enabling robust handling of complex molecules with multiple ionizable sites. We show that Starling achieves comparable or superior accuracy to leading commercial tools on multiple benchmark datasets, and we demonstrate its utility in predicting isoelectric points, logD profiles, and blood–brain-barrier permeability. By maintaining thermodynamic consistency and generating microstate ensembles rapidly, Starling supports accurate physicochemical property prediction with broad relevance to drug discovery and molecular design.
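
To illustrate what computing macroscopic pKa values from per-microstate free energies can look like, the following is a minimal sketch of the standard thermodynamic relation. It is not Starling's implementation. It assumes each microstate comes with a proton count and a free energy in kcal/mol referenced to pH 0, with the proton's standard-state contribution already included. The function names, the input format, and the example numbers are hypothetical.

```python
import math
from collections import defaultdict

R_KCAL = 0.0019872041  # molar gas constant in kcal/(mol*K)


def macrostate_free_energies(microstates, temperature=298.15):
    """Collapse microstates into one free energy per proton count.

    microstates: iterable of (n_protons, free_energy_kcal) pairs.
    Assumption: free energies are referenced to pH 0, so they are
    directly comparable across different proton counts.
    """
    rt = R_KCAL * temperature
    grouped = defaultdict(list)
    for n_protons, g in microstates:
        grouped[n_protons].append(g)

    result = {}
    for n_protons, energies in grouped.items():
        g_min = min(energies)  # shift by the minimum for numerical stability
        z = sum(math.exp(-(g - g_min) / rt) for g in energies)
        result[n_protons] = g_min - rt * math.log(z)
    return result


def macroscopic_pkas(microstates, temperature=298.15):
    """Return macroscopic pKa values, most protonated transition first.

    Assumption: the proton counts present differ by exactly one between
    neighboring macrostates.
    """
    rt_ln10 = R_KCAL * temperature * math.log(10)
    g = macrostate_free_energies(microstates, temperature)
    levels = sorted(g, reverse=True)  # e.g. [2, 1, 0]
    return [(g[lower] - g[upper]) / rt_ln10
            for upper, lower in zip(levels, levels[1:])]


# Hypothetical diprotic molecule: one fully protonated microstate, two
# singly protonated tautomers (micro-pKa 2 and 3), one deprotonated state.
rt_ln10 = R_KCAL * 298.15 * math.log(10)
example = [
    (2, 0.0),
    (1, 2.0 * rt_ln10),
    (1, 3.0 * rt_ln10),
    (0, 11.0 * rt_ln10),
]
pkas = macroscopic_pkas(example)
```

In this toy case the first macroscopic pKa is lower than either micro-pKa because two tautomers share the singly protonated macrostate, and the two macroscopic values still sum to 11, which is the kind of thermodynamic consistency the abstract refers to.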

Supplementary materials

Supplementary Data: Underlying data for Desantis PROTAC pKa prediction, amino-acid isoelectric points, and Kpuu prediction.
