Abstract
The solubility of organic molecules is crucial in organic synthesis and industrial chemistry: it is important in the design of many phase separation and purification units, and it controls the migration of many species into the environment. Solubility limits force chemists and engineers to change the solvent or adjust the temperature of their process, often by trial and error. Here we present a fast and convenient computational method for estimating the solubility of any neutral organic molecule in water and any organic solvent at any temperature. The model is developed by combining fundamental thermodynamic equations with machine learning models for solvation free energy, solvation enthalpy, Abraham solute parameters, and aqueous solubility at 298 K. We provide free, open-source and online tools for the prediction of solubility limits, together with a curated data collection (SolProp) that includes more than 5,000 experimental solubility values for validation of the model. The model predictions are accurate for aqueous systems and for a wide range of organic solvents up to 550 K or higher. We also present methods to further improve solubility predictions by supplying experimental data on the solute of interest in any solvent at any temperature, or on the solute’s sublimation enthalpy.
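As an illustration of how a thermodynamic relation can link these predicted quantities, the sketch below transfers an aqueous solubility to another solvent at the same temperature using the two solvation free energies. It is a minimal sketch under stated assumptions (the solute is in equilibrium with the same solid phase in both solvents, the solutions are dilute, and both free energies use the same concentration-based standard state), not the authors' implementation. The function name and the input values are hypothetical.

```python
import math

R = 8.314462618  # gas constant, J/(mol K)


def log_solubility_in_solvent(log_s_aq, dg_solv_water, dg_solv_solvent,
                              temperature=298.15):
    """Estimate log10 solubility (mol/L) in an organic solvent.

    Assumed relation (hypothetical helper, not the SolProp API):
        logS_solvent = logS_aq + logK_solvent - logK_water
    where logK = -dG_solv / (R T ln 10) is the gas-to-solution
    partition coefficient. Free energies are in J/mol.
    """
    rt_ln10 = R * temperature * math.log(10.0)
    log_k_water = -dg_solv_water / rt_ln10
    log_k_solvent = -dg_solv_solvent / rt_ln10
    return log_s_aq + log_k_solvent - log_k_water


# Hypothetical inputs: logS_aq = -2.0, solvation free energies in J/mol.
print(log_solubility_in_solvent(-2.0, -20000.0, -28000.0))
```

A more negative solvation free energy in the organic solvent than in water raises the estimated solubility relative to the aqueous value; equal free energies leave it unchanged.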
Supplementary materials
Title
Supporting Information
Description
Additional information on the construction of models and datasets. Validation of the different models against more experimental data.
Title
SolProp
Description
Experimental data and machine learning models.
Supplementary weblinks
Title
RMG Solubility Prediction
Description
Web interface to use the models for predicting solubility limits.
Title
SolProp data
Description
The SolProp dataset, including experimental data and machine learning models.
Title
GitHub Source Code
Description
The code used for training the models and making the model predictions for solubility. It does not include the trained machine learning models. Example files are provided to demonstrate how to predict solubility and other properties.
Title
Conda package
Description
A conda package with the compiled source code, including the trained machine learning models. Example notebooks showing how to use the package are available on GitHub.