Abstract
The solubility of organic molecules is crucial in organic synthesis and industrial chemistry; it is important in the design of many phase separation and purification units, and it controls the migration of many species into the environment. Solvents and temperatures for new processes are often chosen by trial and error, because the choice is restricted by unknown solid solubility limits. Here we present a fast and convenient computational method for estimating the solubility of solid neutral organic molecules in water and many organic solvents over a broad range of temperatures. The model combines fundamental thermodynamic equations with machine learning models for solvation free energy, solvation enthalpy, Abraham solute parameters, and aqueous solid solubility at 298 K. We provide free open-source and online tools for predicting solid solubility limits, together with a curated data collection (SolProp) that includes more than 5,000 experimental solid solubility values for validation of the model. The model predictions are accurate for aqueous systems and for a wide range of organic solvents up to 550 K or higher. We also present methods to further improve solid solubility predictions by providing experimental data on the solute of interest in another solvent, or on the solute’s sublimation enthalpy.
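As a rough illustration of how a thermodynamic relation can link the quantities named in the abstract, the sketch below transfers an aqueous solubility at 298 K to another solvent using the difference in solvation free energies. It is a minimal sketch under stated assumptions, not the authors' implementation: it assumes dilute solutions, the same solid phase in equilibrium with both solvents, and solvation free energies defined on a molar-concentration standard state. The function name, argument names, and numerical inputs are hypothetical.

```python
import math

R = 8.314462618  # molar gas constant, J/(mol K)


def log10_solubility_in_solvent(log10_s_aq, dg_solv_aq, dg_solv_solvent,
                                temperature=298.15):
    """Estimate log10 of the solubility (mol/L) in a target solvent.

    Assumption (illustrative only): the solid is in equilibrium with both
    saturated solutions, so the ratio of the two solubilities equals the
    ratio of the gas-to-solvent partition coefficients, each given by
    log10 K = -dG_solv / (R T ln 10).

    log10_s_aq       -- log10 of aqueous solubility in mol/L
    dg_solv_aq       -- solvation free energy in water, J/mol
    dg_solv_solvent  -- solvation free energy in the target solvent, J/mol
    temperature      -- temperature in K
    """
    rt_ln10 = R * temperature * math.log(10)
    return log10_s_aq + (dg_solv_aq - dg_solv_solvent) / rt_ln10


# Hypothetical inputs: a solute that is solvated more favorably in the
# organic solvent than in water is predicted to be more soluble there.
example = log10_solubility_in_solvent(
    log10_s_aq=-2.0, dg_solv_aq=-20000.0, dg_solv_solvent=-28000.0
)
print(f"log10 S in target solvent: {example:.2f}")
```

In this sketch, a solvation free energy that is more negative by R T ln 10 (about 5.7 kJ/mol at 298 K) raises the predicted solubility by one log unit. Extending the estimate to other temperatures would need additional terms, such as the solvation enthalpy mentioned in the abstract, which this sketch does not include.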
Supplementary materials
Title
Supporting Information
Description
Additional information on the construction of models and datasets. Validation of the different models against more experimental data.
Title
SolProp
Description
Experimental data and machine learning models.
Supplementary weblinks
Title
RMG Solubility Prediction
Description
Web interface to use the models for predicting solubility limits.
Title
SolProp data
Description
The SolProp dataset, including experimental data and machine learning models
Title
GitHub Source Code
Description
The code used for training the models and for making the solubility predictions. The repository does not include the trained machine learning models.
Example files are provided to demonstrate how to predict solubility and other properties.
Title
Conda package
Description
A conda package with the compiled source code, including the trained machine learning models. Example notebooks showing how to use the package are available on GitHub.