Machine Learning Transition Temperatures from 2D Structure

09 March 2020, Version 2
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

A priori knowledge of melting and boiling points could expedite the discovery of pharmaceutical, energetic, and energy-harvesting materials. The tools of data science are becoming increasingly important for exploring chemical datasets and predicting material properties. A fundamental part of data-driven modeling is molecular featurization. Herein, we propose a molecular representation with group-constitutive and geometrical descriptors that map to enthalpy and entropy, the two thermodynamic quantities that drive thermal phase transitions. The descriptors are inspired by the linear regression-based quantitative structure-property relationship of Yalkowsky and coworkers known as the Unified Physicochemical Property Estimation Relationships (UPPER). Combined with nonlinear machine learning (specifically, eXtreme Gradient Boosting, or XGBoost), these concise and easy-to-compute descriptors provide an appealing framework for predicting transition enthalpies, entropies, and temperatures in a diverse chemical space. An application to energetic materials shows that UPPER plus XGBoost is predictive despite a relatively modest energetics reference dataset. We also report results on public datasets of melting points (i.e., OCHEM, Enamine, Bradley, and Bergstrom). The newly proposed representation is determined purely from a SMILES string, thus showing promise for fast and accurate screening of thermodynamic properties.
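As a reader's aid, the link between enthalpy, entropy, and transition temperature mentioned in the abstract can be illustrated with the standard equilibrium condition: at a phase transition the Gibbs energy change is zero, so T = ΔH / ΔS. The sketch below is not the authors' code; the function name `transition_temperature` is hypothetical, and the water values are approximate textbook figures used only for illustration, not data from this preprint.

```python
def transition_temperature(delta_h_kj_per_mol: float,
                           delta_s_j_per_mol_k: float) -> float:
    """Return the transition temperature in kelvin from T = dH / dS.

    Follows from dG = dH - T*dS = 0 at phase equilibrium.
    The enthalpy is given in kJ/mol and the entropy in J/(mol K),
    so the enthalpy is converted to J/mol before dividing.
    """
    if delta_s_j_per_mol_k <= 0:
        raise ValueError("transition entropy must be positive")
    return delta_h_kj_per_mol * 1000.0 / delta_s_j_per_mol_k


# Illustrative (assumed) textbook values for the melting of water ice:
# enthalpy of fusion ~6.01 kJ/mol, entropy of fusion ~22.0 J/(mol K).
t_m_water = transition_temperature(6.01, 22.0)
print(f"Estimated melting point: {t_m_water:.1f} K")
```

In a workflow like the one the abstract describes, the two inputs would presumably come from separate learned models of transition enthalpy and entropy rather than from tabulated values.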

Keywords

Machine Learning
Melting Point
Boiling Point
Enthalpy of Transition
Entropy of Transition
XGBoost
Molecular Featurization

Supplementary materials

Title: si
